Metastability of Potential Games 

Diodato Ferraioli* Carmine Ventre^ 



Abstract 

One of the main criticisms of game theory concerns the assumption of full rationality. Logit 
dynamics is a decentralized algorithm in which a level of irrationality (a.k.a. "noise") is introduced 
in players' behavior. In this context, the solution concept of interest becomes the logit equilibrium, 
as opposed to Nash equilibria. Logit equilibria are distributions over strategy profiles that possess 
several nice properties, including existence and uniqueness. However, there are games in which 
their computation may take exponential time. We therefore look at an approximate version of logit 
equilibria, called metastable distributions, introduced by Auletta et al. [6]. These are distributions 
which remain stable (i.e., players do not go too far from them) for a super-polynomial number of steps 
(rather than forever, as for logit equilibria). The hope is that these distributions exist and can be 
reached quickly by logit dynamics. 

We show that any exact potential game admits metastable distributions no matter what level of 
noise is present in the system, and what profile the dynamics starts from. These distributions can be 
quickly reached if the rationality level is not too big when compared to the inverse of the maximum 
difference in potential. For higher values of the rationality level, the result is proved under the 
assumption that the dynamics starts from a rich set of strategy profiles. Our proofs build on results 
which may be of independent interest. Namely, we prove some spectral characterizations of the 
transition matrix defined by logit dynamics for generic games and relate bottleneck ratio and hitting 
time of Markov chains. 

1 Introduction 



One of the most prominent assumptions in game theory dictates that people are rational. This is con- 
trasted by many concrete instances of people making irrational choices in certain strategic situations, 
such as stock markets [27]. This might be due to the incapacity of exactly determining one's own utili- 
ties: the strategic game is played with utilities perturbed by some noise. 

Logit dynamics [7] incorporates this noise in players' actions and is therefore advocated as a good 
model of people's behavior. In more detail, logit dynamics features a rationality level $\beta \geq 0$ (equivalently, 
a noise level $1/\beta$), and each player is assumed to play a strategy with a probability proportional 
to the exponential of $\beta$ times the corresponding utility to the player. So the higher $\beta$ is, the less noise 
there is and the more rational players are. Logit dynamics can then be seen as a noisy best-response dynamics. 

The natural equilibrium concept for logit dynamics is defined by a probability distribution over the 
pure strategy profiles of the game. Whilst for best-response dynamics pure Nash equilibria are stable 
states, in logit dynamics there is a chance, which is inversely proportional to $\beta$, that players deviate 
from such strategy profiles. Pure Nash equilibria are then not an adequate solution concept for this 
dynamics. However, the random process defined by the logit dynamics can be modeled via an ergodic 
Markov chain. Stability in Markov chains is represented by the concept of stationary distributions. These 
distributions, dubbed logit equilibria, are suggested as a suitable solution concept in this context due to 
their properties [5]. For example, from the results known in Markov chain literature, we know that any 
game possesses a logit equilibrium and that this equilibrium is unique. The absence of either of these 
guarantees is often considered a weakness of pure Nash equilibria. Nevertheless, as for Nash equilibria, 
the computation of logit equilibria may be computationally hard depending on whether the chain mixes 
rapidly or not [?]. 

As the hardness of computing Nash equilibria justifies approximate notions of the concept [?], 
so Auletta et al. look at an approximation of logit equilibria that they call metastable distributions. 
These distributions remain stable for a time which is long enough for the observer (in computer science 
terms, this time is assumed to be super-polynomial) rather than forever. Roughly speaking, the stability 
of the distributions in this concept is measured in terms of the generations living some historical era, 
while stationary distributions remain stable throughout all the generations. When the convergence to 
logit equilibria is too slow, then there are generations which are outlived by the computation of the sta- 
tionary distribution. For these generations, metastable distributions appear as a reasonable equilibrium 
concept. (We refer the interested reader to [6] for a complete overview of the rationale of metastability.) 
It is unclear whether and which strategic games possess these distributions and if logit dynamics quickly 
reaches them. 

The focus of this paper is the study of metastable distributions for the class of potential games [21]. 
Potential games are an important and widely studied class of games modeling many strategic settings. 
Each such game satisfies a number of appealing properties, the existence of pure Nash equilibria being 
one of them. A study of the metastability of potential games is especially interesting due to the known 
hardness results, see e.g. [13], which suggest that the computation of pure Nash equilibria for them is 
an intractable problem, even for centralized algorithms. 

Our contribution. We prove that any n-player potential game has a metastable distribution for each 
starting profile of the logit dynamics. These distributions remain stable for a time which is super- 
polynomial in n, if one is content with being within a distance 1/poly(n) from the distributions. (The 
distance is defined in this context as the total variation distance, see below.) To maintain n as our only 
parameter of interest, we assume that the number of strategies available to players is upper bounded by 
a polynomial in n; this assumption can, however, be relaxed to prove bounds asymptotic in n and in the 
logarithm of the maximum number of strategies. 

Regarding the convergence rate to these metastable distributions, called pseudo-mixing time, the 
results we show are more varied. The pseudo-mixing time is polynomial in n for any value of $\beta$ 
when the dynamics starts from a rich set of states. In the case in which the dynamics starts outside this 
set, we prove that the pseudo-mixing time is polynomial in n only for certain values of $\beta$. Namely, 
we need to assume that $\beta$ is not too big when compared to the (inverse of the) maximum difference in 
potential of neighboring profiles. Note that when $\beta$ is very high, logit dynamics is "close" to best-response 
dynamics and therefore it is impossible to prove quick convergence results for any potential 
game due to the aforementioned hardness results. We then give a picture which is, in a sense, as complete 
as possible without restricting the class of potential games considered. Nevertheless, the study of the 
pseudo-mixing time for high values of $\beta$ remains an interesting open question, even for some subclass 
of potential games. 

The proofs of the above results build on a novel conceptual vision of potential games, and a number 
of involved technical contributions, some of which might be of independent interest. For the former 
aspect, note that the potential function associated to an n-player potential game can assume different 
values for a fixed value of n (e.g., a congestion game has different potential values for different facilities' 
latency functions [26]). We can subdivide these values into different classes, each containing a 
unique potential value for each n. Our results for potential games follow from studying existence of, 
and convergence to, metastable distributions for each class independently. Regarding the technical con- 
tributions involved in our work, we mainly study properties of Markov chains. (As mentioned above, 
logit dynamics defines a Markov chain over the set of pure strategy profiles.) The concepts of interest 
are mixing time (how long the chain takes to mix), bottleneck ratio (intuitively, how hard it is for the 
stationary distribution to leave a subset of states), hitting time (how long the chain takes to hit a certain 
subset of states) and spectral properties of the transition matrix of Markov chains. To prove the exis- 
tence of metastable distributions, we define a procedure which iteratively removes from the set of pure 
strategy profiles the sets of states from which logit dynamics leaves with "small" (i.e., the inverse of 
a super-polynomial in n) probability. We call these sets the "core" of the metastable distributions. To 
identify them, we need to prove a connection between bottleneck ratio and hitting time. Specifically, 
we prove an upper bound on the hitting time of a subset of states in terms of the bottleneck ratio of its 
complement. To prove that the pseudo-mixing time is polynomial in n when the starting profile belongs 
to the "core," we firstly relate the pseudo-mixing time to the mixing time of a certain family of restricted 
Markov chains. We then prove that the mixing time of these chains is polynomial by bounding the 
difference in potential of states in the "core" and by using a spectral characterization of the transition 
matrix of restricted Markov chains. Finally, the proof that the pseudo-mixing time is polynomial when 
the dynamics starts outside the "core" mainly relies on another relation we prove between bottleneck 
ratio and hitting time. The former is this time shown to give a certain lower bound on the latter. 

We complement the above contributions with further spectral results about the transition matrix 
of Markov chains defined by logit dynamics for a strategic (not necessarily potential) game and an 
application of the ideas used for proving metastability on a particular class of potential games (Ising 
games on cliques). The latter result closes an open problem left by [?]. 

Related works. Blume [7] introduced logit dynamics for modeling a noisy-rational behavior in game 
dynamics. Early works about this dynamics have focused on its long-term behavior: Blume [7] showed 
that, for 2 x 2 coordination games and potential games, the long-term behavior of the system is concentrated 
around a specific Nash equilibrium; Alós-Ferrer and Netzer [1] gave a general characterization 
of long-term behavior of logit dynamics for wider classes of games. Several works gave bounds on the 
time that the dynamics takes to reach specific Nash equilibria of a game: Ellison [12] considered logit 
dynamics for graphical coordination games on cliques and rings; Peyton Young [25] and Montanari and 
Saberi [22] extended this work to more general families of graphs; Asadpour and Saberi [2] focused on 
a class of congestion games. Auletta et al. [?] were the first to propose the stationary distribution of the 
logit dynamics Markov chain as a new equilibrium concept in game theory and to focus on the time the 
dynamics takes to get close to this equilibrium [4]. 

In physics, chemistry, and biology, metastability is a phenomenon related to the evolution of systems 
under noisy dynamics. In particular, metastability concerns moves between regions of the state spaces 
and the existence of multiple, well separated time scales: at short time scales, the system appears to be 
in a quasi-equilibrium, but really explores only a confined region of the available space state, while, at 
larger time scales, it undergoes transitions between such different regions. Research in physics about 
metastability aims at expressing typical features of a metastable state and to evaluate the transition time 
between metastable states. Several monographs on the subject are available in physics literature (see, for 
example, [14, 23, 8, 15]). Auletta et al. [6] applied metastability to probability distributions, introducing 
the concepts of metastable distribution and pseudo-mixing time for some specific potential games. 

Roughly speaking, metastability is a kind of approximation for stationarity. From this point of 
view, metastable distributions may be likened to approximate equilibria. Two different approaches to 
approximate equilibria have been proposed in the literature. In the multiplicative version [?], a profile is 
an approximate equilibrium as long as each player gains at least a factor $(1 - \varepsilon)$ of the payoff she gets by 
playing any other strategy: these equilibria have been shown to be computationally hard both in general 
[?] and for congestion games [28]. In the additive version [?], a profile is an approximate equilibrium 
as long as each player gains at least the payoff she gains by playing any other strategy minus a small 
additive factor $\varepsilon > 0$: for these equilibria a quasi-polynomial time approximation scheme exists [20] but 
it is impossible to have an FPTAS [?]. 

2 Preliminaries 

A strategic game $\mathcal{G}$ is a triple $([n], \mathcal{S}, \mathcal{U})$, where $[n] = \{1, \ldots, n\}$ is a finite set of players, 
$\mathcal{S} = (S_1, \ldots, S_n)$ is a family of non-empty finite sets ($S_i$ is the set of strategies available to player $i$), and 
$\mathcal{U} = (u_1, \ldots, u_n)$ is a family of utility functions (or payoffs), where $u_i : S \to \mathbb{R}$, $S = S_1 \times \cdots \times S_n$ 
being the set of all strategy profiles, is the utility function of player $i$. We let $m$ denote an upper bound 
to the size of players' strategy sets, that is, $m \geq \max_{i = 1, \ldots, n} |S_i|$. We focus on (exact) potential games, 
i.e., games for which there exists a function $\Phi : S \to \mathbb{R}$ such that for any pair of $\mathbf{x}, \mathbf{y} \in S$ with $\mathbf{y} = (\mathbf{x}_{-i}, y_i)$, 
we have: 

$$\Phi(\mathbf{x}) - \Phi(\mathbf{y}) = u_i(\mathbf{y}) - u_i(\mathbf{x}).$$

Note that we use bold symbols to denote vectors and the standard game theoretic notation $(\mathbf{x}_{-i}, s)$ to 
mean the vector obtained from $\mathbf{x}$ by replacing the $i$-th entry with $s$; i.e., $(\mathbf{x}_{-i}, s) = (x_1, \ldots, x_{i-1}, s, x_{i+1}, \ldots, x_n)$. 
A strategy profile $\mathbf{x}$ is a Nash equilibrium^1 if, for all $i$, $u_i(\mathbf{x}) \geq u_i(\mathbf{x}_{-i}, s_i)$, for all $s_i \in S_i$. It 
is fairly easy to see that local minima of the potential function correspond to the Nash equilibria of the 
game. 

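For two vectors $\mathbf{x}, \mathbf{y}$, we denote with $H(\mathbf{x}, \mathbf{y}) = |\{i : x_i \neq y_i\}|$ the Hamming distance between 
$\mathbf{x}$ and $\mathbf{y}$. For every $\mathbf{x} \in S$, $N(\mathbf{x}) = \{\mathbf{y} \in S : H(\mathbf{x}, \mathbf{y}) = 1\}$ denotes the set of neighbors of $\mathbf{x}$ and 
$N_i(\mathbf{x}) = \{\mathbf{y} \in N(\mathbf{x}) : \mathbf{y}_{-i} = \mathbf{x}_{-i}\}$ is the set of those neighbors that differ exactly in the $i$-th coordinate. 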

In this paper, given a set of profiles $L$ we let $\overline{L}$ denote its complementary set, i.e., $\overline{L} = S \setminus L$. We 
will also adopt the following asymptotic notation to distinguish the growth rate of functions. To say that a 
function $f(n)$ is upper bounded by a polynomial in $n$ we will write $f(n) = e^{O(\log n)}$; similarly, to say 
that $f(n)$ is lower bounded by a super-polynomial in $n$, we will write $f(n) = e^{\omega(\log n)}$. 

2.1 Logit dynamics 

The logit dynamics has been introduced in [7] and runs as follows: at every time step (i) Select one 
player $i \in [n]$ uniformly at random; (ii) Update the strategy of player $i$ according to the Boltzmann 
distribution with parameter $\beta$ over the set $S_i$ of her strategies. That is, a strategy $s_i \in S_i$ will be selected 
with probability 

$$\sigma_i(s_i \mid \mathbf{x}_{-i}) = \frac{1}{Z_i(\mathbf{x}_{-i})} e^{\beta u_i(\mathbf{x}_{-i}, s_i)}, \qquad (1)$$

where $\mathbf{x}_{-i} \in S_{-i} = S_1 \times \cdots \times S_{i-1} \times S_{i+1} \times \cdots \times S_n$ is the profile of strategies played at the current time 
step by players different from $i$, $Z_i(\mathbf{x}_{-i}) = \sum_{z \in S_i} e^{\beta u_i(\mathbf{x}_{-i}, z)}$ is the normalizing factor, and $\beta \geq 0$. 
One can see parameter $\beta$ as the inverse of the noise or, equivalently, the rationality level of the system: 
indeed, from (1), it is easy to see that for $\beta = 0$ player $i$ selects her strategy uniformly at random, for 
$\beta > 0$ the probability is biased toward strategies promising higher payoffs, and for $\beta$ that goes to infinity 
player $i$ chooses her best response strategy (if more than one best response is available, she chooses one 
of them uniformly at random). 

1 In this paper, we only focus on pure Nash equilibria. We avoid explicitly mentioning it throughout. 
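As an illustration of the update rule (1), the following minimal Python sketch computes the Boltzmann probabilities of a single player once the strategies of the other players are fixed. The function name `logit_choice_probs` and the payoff values are hypothetical and chosen only for this example; the three calls mirror the regimes $\beta = 0$, moderate $\beta$, and large $\beta$ discussed above.

```python
import math

def logit_choice_probs(utilities, beta):
    """Return the Boltzmann probabilities of eq. (1) for one player.

    `utilities[s]` plays the role of u_i(x_{-i}, s): the payoff of the
    player for each of her strategies s, with x_{-i} held fixed.
    """
    # Subtract the maximum before exponentiating for numerical stability;
    # numerator and normalizing factor are rescaled by the same constant.
    top = max(utilities.values())
    weights = {s: math.exp(beta * (u - top)) for s, u in utilities.items()}
    z = sum(weights.values())  # rescaled normalizing factor Z_i(x_{-i})
    return {s: w / z for s, w in weights.items()}

# Hypothetical payoffs of one player against a fixed x_{-i}.
payoffs = {"a": 1.0, "b": 0.0, "c": -2.0}
uniform = logit_choice_probs(payoffs, beta=0.0)              # pure noise
noisy = logit_choice_probs(payoffs, beta=1.0)                # biased to "a"
near_best_response = logit_choice_probs(payoffs, beta=50.0)  # almost greedy
```

Shifting all utilities by a constant leaves the probabilities unchanged, which is why the stabilizing subtraction is harmless.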

The above dynamics defines a Markov chain $\{X_t\}_{t \in \mathbb{N}}$ with the set of strategy profiles as state space, 
and where the transition probability from profile $\mathbf{x} = (x_1, \ldots, x_n)$ to profile $\mathbf{y} = (y_1, \ldots, y_n)$, denoted 
$P(\mathbf{x}, \mathbf{y}) = \mathbf{P}_{\mathbf{x}}(X_1 = \mathbf{y})$^2, is zero if $H(\mathbf{x}, \mathbf{y}) \geq 2$ and it is $\frac{1}{n} \sigma_i(y_i \mid \mathbf{x}_{-i})$ if the two profiles differ 
exactly at player $i$. More formally, we can define the logit dynamics as follows. 

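Definition 2.1. Let $\mathcal{G} = ([n], \mathcal{S}, \mathcal{U})$ be a strategic game and let $\beta \geq 0$. The logit dynamics for $\mathcal{G}$ is the 
Markov chain $\mathcal{M}_\beta = (\{X_t\}_{t \in \mathbb{N}}, S, P)$ where $S = S_1 \times \cdots \times S_n$ and 

$$P(\mathbf{x}, \mathbf{y}) = \frac{1}{n} \cdot \begin{cases} \sigma_i(y_i \mid \mathbf{x}_{-i}), & \text{if } \mathbf{y}_{-i} = \mathbf{x}_{-i} \text{ and } y_i \neq x_i; \\ \sum_{i=1}^{n} \sigma_i(y_i \mid \mathbf{x}_{-i}), & \text{if } \mathbf{y} = \mathbf{x}; \\ 0, & \text{otherwise;} \end{cases} \qquad (2)$$

where $\sigma_i(y_i \mid \mathbf{x}_{-i})$ is defined in (1). 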
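To make Definition 2.1 concrete, here is a minimal sketch, under the assumption of a small hypothetical 2-player coordination game, that assembles the transition matrix (2) profile by profile. The names `logit_transition_matrix` and `coordination` are illustrative and do not come from the paper.

```python
import itertools
import math

def logit_transition_matrix(strategy_sets, utility, beta):
    """Build P of eq. (2); `utility(i, profile)` returns u_i(profile)."""
    n = len(strategy_sets)
    profiles = list(itertools.product(*strategy_sets))
    index = {p: k for k, p in enumerate(profiles)}
    P = [[0.0] * len(profiles) for _ in profiles]
    for x in profiles:
        for i in range(n):
            # Candidate profiles (x_{-i}, s) and their Boltzmann weights.
            moves = [x[:i] + (s,) + x[i + 1:] for s in strategy_sets[i]]
            weights = [math.exp(beta * utility(i, y)) for y in moves]
            z = sum(weights)  # normalizing factor Z_i(x_{-i})
            for y, w in zip(moves, weights):
                # Player i is picked w.p. 1/n; s = x_i feeds the diagonal.
                P[index[x]][index[y]] += w / (n * z)
    return profiles, P

# Hypothetical coordination game: both players get 1 if they match.
def coordination(i, profile):
    return 1.0 if profile[0] == profile[1] else 0.0

profiles, P = logit_transition_matrix([(0, 1), (0, 1)], coordination, beta=2.0)
```

Accumulating with `+=` handles the diagonal case of (2) automatically, since every player contributes the probability of re-selecting her current strategy.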



The Markov chain defined by (2) is ergodic [?]. Hence, from every initial profile $\mathbf{x}$ the distribution 
$P^t(\mathbf{x}, \cdot)$ over states of $S$ of the chain $X_t$ starting at $\mathbf{x}$ will eventually converge to a stationary distribution 
$\pi$ as $t$ tends to infinity. As in [?], we call the stationary distribution $\pi$ of the Markov chain defined by 
the logit dynamics on a game $\mathcal{G}$ the logit equilibrium of $\mathcal{G}$. In general, a Markov chain with transition 
matrix $P$ and state space $S$ is said to be reversible with respect to a distribution $\pi$ if, for all $\mathbf{x}, \mathbf{y} \in S$, 
it holds that $\pi(\mathbf{x}) P(\mathbf{x}, \mathbf{y}) = \pi(\mathbf{y}) P(\mathbf{y}, \mathbf{x})$. If an ergodic chain is reversible with respect to $\pi$, then $\pi$ is 
its stationary distribution. Therefore when this happens, to simplify our exposition we simply say that 
the matrix $P$ is reversible. For the class of potential games the stationary distribution is the well-known 
Gibbs measure. 

Theorem 2.2 ([7]). If $\mathcal{G} = ([n], \mathcal{S}, \mathcal{U})$ is a potential game with potential function $\Phi$ then the Markov 
chain given by (2) is reversible with respect to the Gibbs measure $\pi(\mathbf{x}) = \frac{1}{Z} e^{-\beta \Phi(\mathbf{x})}$, where 
$Z = \sum_{\mathbf{y} \in S} e^{-\beta \Phi(\mathbf{y})}$ is the normalizing constant. 
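Theorem 2.2 can be checked numerically on a small instance. The sketch below assumes a hypothetical 3-player game with binary strategies, whose potential counts the pairs of players that disagree and whose utilities are $u_i = -\Phi$ (a common-interest choice that satisfies the exact-potential identity). It measures how far the chain is from detailed balance with respect to the Gibbs measure.

```python
import itertools
import math

def potential(x):
    # Hypothetical potential: number of pairs of players that disagree.
    return float(sum(a != b for a, b in itertools.combinations(x, 2)))

def utility(i, x):
    # u_i = -Phi gives Phi(x) - Phi(y) = u_i(y) - u_i(x) for every i.
    return -potential(x)

def logit_chain(strategy_sets, utility, beta):
    """Transition matrix of eq. (2) over all strategy profiles."""
    n = len(strategy_sets)
    profiles = list(itertools.product(*strategy_sets))
    index = {p: k for k, p in enumerate(profiles)}
    P = [[0.0] * len(profiles) for _ in profiles]
    for x in profiles:
        for i in range(n):
            moves = [x[:i] + (s,) + x[i + 1:] for s in strategy_sets[i]]
            weights = [math.exp(beta * utility(i, y)) for y in moves]
            z = sum(weights)
            for y, w in zip(moves, weights):
                P[index[x]][index[y]] += w / (n * z)
    return profiles, P

beta = 1.5
profiles, P = logit_chain([(0, 1)] * 3, utility, beta)

# Gibbs measure pi(x) = exp(-beta * Phi(x)) / Z.
gibbs_weights = [math.exp(-beta * potential(x)) for x in profiles]
Z = sum(gibbs_weights)
pi = [w / Z for w in gibbs_weights]

N = len(profiles)
# Largest violation of pi(x) P(x, y) = pi(y) P(y, x).
balance_gap = max(abs(pi[a] * P[a][b] - pi[b] * P[b][a])
                  for a in range(N) for b in range(N))
# Largest violation of pi P = pi.
stationarity_gap = max(abs(sum(pi[a] * P[a][b] for a in range(N)) - pi[b])
                       for b in range(N))
```

Both gaps should vanish up to floating-point error; this is a sanity check on one instance, not a proof.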

It is worthwhile to notice that logit dynamics for potential games and Glauber dynamics for Gibbs 
distributions are two ways of looking at the same Markov chain (see [7] for details). This, in particular, 
implies that we can write 

$$\sigma_i(s_i \mid \mathbf{x}_{-i}) = \frac{e^{-\beta \Phi(\mathbf{x}_{-i}, s_i)}}{\sum_{z \in S_i} e^{-\beta \Phi(\mathbf{x}_{-i}, z)}}.$$

2 Throughout this work, we denote with $\mathbf{P}_{\mathbf{x}}(\cdot)$ the probability of an event conditioned on the starting state of the logit 
dynamics being $\mathbf{x}$. 

2.2 Convergence of Markov chains 

Mixing time. Arguably, the principal notion to measure the rate of convergence of a Markov chain to 
its stationary distribution is the mixing time, which is defined as follows. Let us set 

$$d(t) = \max_{\mathbf{x} \in S} \| P^t(\mathbf{x}, \cdot) - \pi \|_{TV},$$

where the total variation distance $\| \mu - \nu \|_{TV}$ between two probability distributions $\mu$ and $\nu$ on the same 
state space $S$ is defined as 

$$\| \mu - \nu \|_{TV} = \max_{A \subseteq S} |\mu(A) - \nu(A)| = \frac{1}{2} \sum_{\mathbf{x} \in S} |\mu(\mathbf{x}) - \nu(\mathbf{x})|.$$

For $0 < \varepsilon < 1/2$, the mixing time of the logit dynamics is defined as 

$$t_{mix}(\varepsilon) = \min\{t \in \mathbb{N} : d(t) \leq \varepsilon\}.$$

It is usual to set $\varepsilon = 1/4$ or $\varepsilon = 1/(2e)$. We write $t_{mix}$ to mean $t_{mix}(1/4)$ and we refer generically to 
"mixing time" when the actual value of $\varepsilon$ is immaterial. Observe that $t_{mix}(\varepsilon) \leq \lceil \log_2 \varepsilon^{-1} \rceil \, t_{mix}$. 
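For very small games, $d(t)$ and $t_{mix}(\varepsilon)$ can be computed by brute force straight from the definitions. The sketch below does so for a hypothetical 3-player potential game (potential = number of disagreeing pairs, utilities $u_i = -\Phi$) at a low and a high rationality level; the `horizon` cut-off is an assumption of the sketch, not part of the definition.

```python
import itertools
import math

def potential(x):
    # Hypothetical potential: number of pairs of players that disagree.
    return float(sum(a != b for a, b in itertools.combinations(x, 2)))

def logit_chain(strategy_sets, beta):
    """Transition matrix (2) and Gibbs measure for u_i = -potential."""
    n = len(strategy_sets)
    profiles = list(itertools.product(*strategy_sets))
    index = {p: k for k, p in enumerate(profiles)}
    P = [[0.0] * len(profiles) for _ in profiles]
    for x in profiles:
        for i in range(n):
            moves = [x[:i] + (s,) + x[i + 1:] for s in strategy_sets[i]]
            weights = [math.exp(-beta * potential(y)) for y in moves]
            z = sum(weights)
            for y, w in zip(moves, weights):
                P[index[x]][index[y]] += w / (n * z)
    gibbs = [math.exp(-beta * potential(x)) for x in profiles]
    total = sum(gibbs)
    return P, [g / total for g in gibbs]

def tv_distance(mu, nu):
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))

def mixing_time(P, pi, eps=0.25, horizon=5000):
    """Smallest t <= horizon with d(t) <= eps, or None if not reached."""
    size = len(P)
    # rows[a] holds the distribution P^t(a, .), starting from t = 0.
    rows = [[float(a == b) for b in range(size)] for a in range(size)]
    for t in range(horizon + 1):
        if max(tv_distance(row, pi) for row in rows) <= eps:
            return t
        rows = [[sum(row[a] * P[a][b] for a in range(size))
                 for b in range(size)] for row in rows]
    return None

t_low_beta = mixing_time(*logit_chain([(0, 1)] * 3, beta=0.2))
t_high_beta = mixing_time(*logit_chain([(0, 1)] * 3, beta=3.0))
```

On this instance the high-$\beta$ chain needs many more steps, because leaving a consensus profile requires a move that increases the potential.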

Relaxation time. Another important measure of convergence for Markov chains is given by the relaxation 
time. Let $P$ be the transition matrix of a Markov chain with finite state space $S$; let us label the 
eigenvalues of $P$ in non-increasing order 

$$\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_{|S|}.$$

It is well-known (see, for example, Lemma 12.1 in [19]) that $\lambda_1 = 1$ and, if $P$ is irreducible and 
aperiodic, then $\lambda_2 < 1$ and $\lambda_{|S|} > -1$. We set $\lambda^\star$ as the largest eigenvalue in absolute value other than 
$\lambda_1$, 

$$\lambda^\star = \max_{i = 2, \ldots, |S|} \{ |\lambda_i| \}.$$

The relaxation time $t_{rel}$ of a Markov chain $\mathcal{M}$ is defined as 

$$t_{rel} = \frac{1}{1 - \lambda^\star}.$$

It turns out that mixing time and relaxation time are strictly related (see results summarized in Appendix A). 

Hitting time. In some cases, we are interested in bounding the first time that the chain hits a profile 
in a certain set of states, also known as its hitting time. Formally, for a set $L \subseteq S$, we denote by $\tau_L$ 
the random variable denoting the hitting time of $L$. Note that the hitting time, differently from mixing 
and relaxation time, depends on where the dynamics starts. Some useful facts about hitting time are 
summarized in Appendix B. 

Bottleneck ratio. Quite central in our study is the concept of bottleneck ratio. Consider an ergodic 
Markov chain with finite state space $S$, transition matrix $P$, and stationary distribution $\pi$. The probability 
distribution $Q(\mathbf{x}, \mathbf{y}) = \pi(\mathbf{x}) P(\mathbf{x}, \mathbf{y})$ is of particular interest and is sometimes called the edge stationary 
distribution. Note that if the chain is reversible then $Q(\mathbf{x}, \mathbf{y}) = Q(\mathbf{y}, \mathbf{x})$. For any $L \subseteq S$, $L \neq \emptyset$, we let 
$Q(L, S \setminus L) = \sum_{\mathbf{x} \in L, \mathbf{y} \in S \setminus L} Q(\mathbf{x}, \mathbf{y})$. Then the bottleneck ratio of $L$ is 

$$B(L) = \frac{Q(L, S \setminus L)}{\pi(L)}.$$

Throughout the paper we assume that the bottleneck ratio of the entire strategy space $S$ is zero, that is, 
$B(S) = 0$. Useful facts about the bottleneck ratio, used in the sequel, are surveyed in Appendix C. 

2.3 Metastable distributions 



In this section we give formal definitions of metastable distributions and pseudo-mixing time. We also 
survey some of the tools used for our results. For a more detailed description we refer the reader to [6]. 

Definition 2.3. Let $P$ be the transition matrix of a Markov chain with finite state space $S$. A probability 
distribution $\mu$ over $S$ is $(\varepsilon, T)$-metastable for $P$ (or simply metastable, for short) if for every $0 \leq t \leq T$ 
it holds that 

$$\| \mu P^t - \mu \|_{TV} \leq \varepsilon.$$

The definition of metastable distribution captures the idea of a distribution that behaves approxi- 
mately like the stationary distribution: if we start from such a distribution and run the chain we stay 
close to it for a "long" time. Some interesting properties of metastable distributions are discussed in [6], 
including the following lemmata, that turn out to be useful for proving our results. 

Lemma 2.4 ([6]). Let $P$ be a Markov chain with finite state space $S$ and stationary distribution $\pi$. For 
a subset of states $L \subseteq S$ let $\pi_L$ be the stationary distribution conditioned on $L$, i.e. 

$$\pi_L(\mathbf{x}) = \begin{cases} \pi(\mathbf{x}) / \pi(L), & \text{if } \mathbf{x} \in L; \\ 0, & \text{otherwise.} \end{cases} \qquad (3)$$

Then, $\pi_L$ is $(B(L), 1)$-metastable. 

Lemma 2.5 ([6]). If $\mu$ is $(\varepsilon, 1)$-metastable for $P$ then $\mu$ is $(\varepsilon T, T)$-metastable for $P$. 
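Lemmas 2.4 and 2.5 can be illustrated numerically. The sketch below assumes the same kind of hypothetical 3-player potential game used for illustration (potential = number of disagreeing pairs, $u_i = -\Phi$), takes $L$ to be the profiles in which at most one player plays strategy 1, computes $B(L)$, and tracks how far $\pi_L P^t$ drifts from $\pi_L$. The choice of $L$, $\beta$ and $T$ is arbitrary.

```python
import itertools
import math

def potential(x):
    # Hypothetical potential: number of pairs of players that disagree.
    return float(sum(a != b for a, b in itertools.combinations(x, 2)))

def logit_chain(strategy_sets, beta):
    """Profiles, transition matrix (2) and Gibbs measure for u_i = -Phi."""
    n = len(strategy_sets)
    profiles = list(itertools.product(*strategy_sets))
    index = {p: k for k, p in enumerate(profiles)}
    P = [[0.0] * len(profiles) for _ in profiles]
    for x in profiles:
        for i in range(n):
            moves = [x[:i] + (s,) + x[i + 1:] for s in strategy_sets[i]]
            weights = [math.exp(-beta * potential(y)) for y in moves]
            z = sum(weights)
            for y, w in zip(moves, weights):
                P[index[x]][index[y]] += w / (n * z)
    gibbs = [math.exp(-beta * potential(x)) for x in profiles]
    total = sum(gibbs)
    return profiles, P, [g / total for g in gibbs]

def tv_distance(mu, nu):
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))

def bottleneck_ratio(P, pi, L):
    """B(L) = Q(L, S \\ L) / pi(L), with Q(x, y) = pi(x) P(x, y)."""
    inside = set(L)
    flow = sum(pi[a] * P[a][b] for a in inside
               for b in range(len(P)) if b not in inside)
    return flow / sum(pi[a] for a in inside)

profiles, P, pi = logit_chain([(0, 1)] * 3, beta=3.0)
N = len(profiles)
L = [k for k, x in enumerate(profiles) if sum(x) <= 1]
B = bottleneck_ratio(P, pi, L)

# pi_L of eq. (3): the stationary distribution conditioned on L.
mass = sum(pi[k] for k in L)
pi_L = [pi[k] / mass if k in L else 0.0 for k in range(N)]

# Drift of pi_L P^t away from pi_L for t = 1, ..., T.
T = 20
drifts, nu = [], pi_L
for _ in range(T):
    nu = [sum(nu[a] * P[a][b] for a in range(N)) for b in range(N)]
    drifts.append(tv_distance(nu, pi_L))
```

Here `drifts[0]` is the one-step drift bounded by Lemma 2.4, and every entry of `drifts` should stay below `T * B`, as Lemma 2.5 states.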

Among all metastable distributions, we are interested in the ones that are quickly reached from a 
(possibly large) set of states. This motivates the following definition. 

Definition 2.6. Let $P$ be the transition matrix of a Markov chain with state space $S$, let $L \subseteq S$ be a 
non-empty set of states and let $\mu$ be a probability distribution over $S$. We define the pseudo-mixing time 
as 

$$t_\mu^L(\varepsilon) = \inf\{ t \in \mathbb{N} : \| P^t(\mathbf{x}, \cdot) - \mu \|_{TV} \leq \varepsilon \text{ for all } \mathbf{x} \in L \}.$$

Since the stationary distribution $\pi$ of an ergodic Markov chain is reached within $\varepsilon$ in time $t_{mix}(\varepsilon)$ 
from every state, according to Definition 2.6 we have that $t_\pi^S(\varepsilon) = t_{mix}(\varepsilon)$. The following simple lemma 
connects metastability and pseudo-mixing time. 

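Lemma 2.7 ([5]). Let $\mu$ be a $(\varepsilon, T)$-metastable distribution and let $L \subseteq S$ be a set of states such that 
$t_\mu^L(\varepsilon)$ is finite. Then for every $\mathbf{x} \in L$ it holds that $\| P^t(\mathbf{x}, \cdot) - \mu \|_{TV} \leq 2\varepsilon$ for every 
$t_\mu^L(\varepsilon) \leq t \leq t_\mu^L(\varepsilon) + T$. 
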

3 Spectral properties of the logit dynamics 

In [3] it has been shown that all the eigenvalues of the transition matrix of logit dynamics for potential 
games are non-negative. The technique used in that proof can be generalized to work also for some 
restrictions of these matrices. 

To begin, we note that the definition of reversibility can be extended in a natural way to any square 
matrix and probability distribution over the set of rows of the matrix. We then state a fairly standard 
result relating eigenvalues of matrices and certain inner products. 

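Lemma 3.1. Let $P$ be a square matrix on state space $S$ and $\pi$ be a probability distribution on $S$. If $P$ is 
reversible with respect to $\pi$ and has no negative eigenvalues then for any function $f : S \to \mathbb{R}$ we have 

$$\langle Pf, f \rangle_\pi := \sum_{\mathbf{x} \in S} \pi(\mathbf{x}) (Pf)(\mathbf{x}) f(\mathbf{x}) \geq 0.$$

Proof. Let $\lambda_1, \ldots, \lambda_s$, $s = |S|$, be the eigenvalues of $P$. Moreover, let $f_1, \ldots, f_s$ denote their corresponding 
eigenfunctions. For any $\mathbf{x} \in S$, we then have $(P f_i)(\mathbf{x}) = \lambda_i f_i(\mathbf{x})$. Since $P$ is reversible 
then we know that the eigenfunctions assume real values and that they form an orthonormal basis for the 
space $(\mathbb{R}^S, \langle \cdot, \cdot \rangle_\pi)$ (see, e.g., Lemma 12.2 in [19]). Then any real-valued function $f$ defined upon $S$ can 
be expressed as a linear combination of the $f_i$'s. Thus, there exist $\alpha_i$'s in $\mathbb{R}$ such that $f = \sum_{i=1}^{s} \alpha_i f_i$ and, 
since the cross terms vanish by orthonormality, 

$$\sum_{\mathbf{x} \in S} \pi(\mathbf{x}) (Pf)(\mathbf{x}) f(\mathbf{x}) = \sum_{\mathbf{x} \in S} \pi(\mathbf{x}) \sum_{i=1}^{s} \alpha_i^2 (P f_i)(\mathbf{x}) f_i(\mathbf{x}) = \sum_{\mathbf{x} \in S} \pi(\mathbf{x}) \sum_{i=1}^{s} \alpha_i^2 \lambda_i f_i^2(\mathbf{x}) \geq 0. \qquad \square$$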

To specify the restrictions of the transition matrix we are interested in, let $\mathcal{G}$ be a game with profile 
space $S$ and let $P$ be the transition matrix of the logit dynamics for $\mathcal{G}$; we say that a $|A| \times |A|$ matrix $P'$, 
with $A \subseteq S$, is a nice restriction of $P$ if there exists $L \subseteq A$, $L \neq \emptyset$, such that $P'(\mathbf{x}, \mathbf{x}) \geq P(\mathbf{x}, \mathbf{x})$ for 
$\mathbf{x} \in L$, $P'(\mathbf{x}, \mathbf{y}) = P(\mathbf{x}, \mathbf{y})$ if $\mathbf{x}, \mathbf{y} \in L$, $\mathbf{x} \neq \mathbf{y}$, and is 0 otherwise. Note that $P$ is a nice restriction of 
itself. We generalize the result given in [3] to nice restrictions of the transition matrix of logit dynamics 
for potential games. 

Theorem 3.2. Let $\mathcal{G}$ be a game with profile space $S$, let $P$ be the transition matrix of the logit dynamics 
for $\mathcal{G}$ and let $P'$ be a nice restriction of $P$ with state space $A$. If $P$ is reversible with respect to $\pi$ then 
no eigenvalue of $P'$ is negative. 

Proof. Firstly, note that if $P$ is reversible with respect to $\pi$ then the nice restriction $P'$, defined upon a 
subset of states $A$, is reversible with respect to $\pi'$ defined as $\pi$ restricted to $A$, i.e., $\pi'(\mathbf{x}) = \pi(\mathbf{x}) / \pi(A)$ 
for $\mathbf{x} \in A$. 

Assume for sake of contradiction that there exists an eigenvalue $\lambda < 0$ of $P'$. Let $f_\lambda$ be an eigenfunction 
of $\lambda$. Note that since $P$ is reversible then $f_\lambda$ is real-valued. By definition, $f_\lambda \neq 0$; hence, 
since $\lambda < 0$ and as $(P' f_\lambda)(\mathbf{x}) = \lambda f_\lambda(\mathbf{x})$, then for every profile $\mathbf{x} \in A$ such that $f_\lambda(\mathbf{x}) \neq 0$ we have 
$\mathrm{sign}((P' f_\lambda)(\mathbf{x})) \neq \mathrm{sign}(f_\lambda(\mathbf{x}))$ and thus 

$$\langle P' f_\lambda, f_\lambda \rangle_{\pi'} = \sum_{\mathbf{x} \in A} \pi'(\mathbf{x}) (P' f_\lambda)(\mathbf{x}) f_\lambda(\mathbf{x}) < 0.$$

Let $L$ denote the maximal subset of $A$ for which $P'$ is a nice restriction of $P$. Let us denote with 
$P^L$ the transition matrix on the state space $A$ such that $P^L(\mathbf{x}, \mathbf{y}) = P(\mathbf{x}, \mathbf{y})$ for every $\mathbf{x}, \mathbf{y} \in L$ and 
$P^L(\mathbf{x}, \mathbf{y}) = 0$ otherwise. Then we can write $P'$ as $P^L + (P' - P^L)$: by the definition of nice restriction 
$(P' - P^L)$ is a non-negative diagonal matrix. Therefore, $(P' - P^L)$ is reversible with respect to $\pi'$. 
Since the eigenvalues of a diagonal matrix are exactly the diagonal elements, we have that $(P' - P^L)$ 
has non-negative eigenvalues and then, by Lemma 3.1, $\langle (P' - P^L) f_\lambda, f_\lambda \rangle_{\pi'} \geq 0$. Moreover, for every $i$ 
and for every $\mathbf{z}_{-i}$, we denote with $P_{i, \mathbf{z}_{-i}}$ the matrix such that for every $\mathbf{x}, \mathbf{y} \in A$ 

$$P_{i, \mathbf{z}_{-i}}(\mathbf{x}, \mathbf{y}) = \frac{1}{n Z_i(\mathbf{z}_{-i})} \cdot \begin{cases} e^{\beta u_i(\mathbf{y})}, & \text{if } \mathbf{x}_{-i} = \mathbf{y}_{-i} = \mathbf{z}_{-i} \text{ and } \mathbf{x}, \mathbf{y} \in L; \\ 0, & \text{otherwise.} \end{cases} \qquad (4)$$

Observe that $P_{i, \mathbf{z}_{-i}}$ has at least one non-zero row and that all non-zero rows of $P_{i, \mathbf{z}_{-i}}$ are the same. Thus 
$P_{i, \mathbf{z}_{-i}}$ has rank 1, and hence since it is a non-negative matrix all its eigenvalues are non-negative [16]^3. 
Moreover, since all off-diagonal entries of $P_{i, \mathbf{z}_{-i}}$ are either 0 or equal to the corresponding entry of $P'$ we 
can conclude that $P_{i, \mathbf{z}_{-i}}$ is reversible with respect to $\pi'$. Thus, Lemma 3.1 yields $\langle P_{i, \mathbf{z}_{-i}} f_\lambda, f_\lambda \rangle_{\pi'} \geq 0$. 
Finally, observe that $P^L = \sum_i \sum_{\mathbf{z}_{-i}} P_{i, \mathbf{z}_{-i}}$. Hence from the linearity of the inner product, it follows 
that $\langle P' f_\lambda, f_\lambda \rangle_{\pi'} \geq 0$ and thus we reach a contradiction. $\square$ 

3 This result about the eigenvalues of matrices with rank 1 appears as an exercise at page 61 of [16] and in [24]. 

The theorem above turns out to be very useful to prove our main result presented in the next section. 

We next give other interesting spectral results about the transition matrix generated by the logit 
dynamics. In particular, by using a matrix decomposition similar to the one adopted in the proof of 
Theorem 3.2 we can prove the following theorem. (We remark that the next results in this section do not 
need to assume that the chain is reversible and indeed apply to any strategic game.) 

Theorem 3.3. Let $\mathcal{G}$ be a game with profile space $S$ and let $P$ be the transition matrix of the logit 
dynamics for $\mathcal{G}$. The trace of $P$ is independent of $\beta$. 

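Proof. For every $i$ and for every $\mathbf{z}_{-i}$ consider the matrices $P_{i, \mathbf{z}_{-i}}$ defined in (4), with $L = S$. 
Let $S_{i, \mathbf{z}_{-i}} = \{(\mathbf{z}_{-i}, s_i) \mid s_i \in S_i\}$. Since player $i$ is selected with probability $1/n$, for every 
$\mathbf{x} \in S_{i, \mathbf{z}_{-i}}$ we have $P_{i, \mathbf{z}_{-i}}(\mathbf{x}, \mathbf{x}) = \frac{1}{n} - \sum_{\mathbf{y} \in S_{i, \mathbf{z}_{-i}}, \mathbf{y} \neq \mathbf{x}} P(\mathbf{x}, \mathbf{y})$. Hence, the trace of $P_{i, \mathbf{z}_{-i}}$ is 

$$\sum_{\mathbf{x} \in S_{i, \mathbf{z}_{-i}}} P_{i, \mathbf{z}_{-i}}(\mathbf{x}, \mathbf{x}) = \frac{|S_i|}{n} - \sum_{\mathbf{x} \in S_{i, \mathbf{z}_{-i}}} \sum_{\mathbf{y} \in S_{i, \mathbf{z}_{-i}}, \mathbf{y} \neq \mathbf{x}} P(\mathbf{x}, \mathbf{y}).$$

Since all non-zero elements in a column of $P_{i, \mathbf{z}_{-i}}$ are the same we also have 

$$P_{i, \mathbf{z}_{-i}}(\mathbf{x}, \mathbf{x}) = \frac{1}{|S_i| - 1} \sum_{\mathbf{y} \in S_{i, \mathbf{z}_{-i}}, \mathbf{y} \neq \mathbf{x}} P(\mathbf{y}, \mathbf{x}).$$

By setting $C = \sum_{\mathbf{x} \in S_{i, \mathbf{z}_{-i}}} \sum_{\mathbf{y} \in S_{i, \mathbf{z}_{-i}}, \mathbf{y} \neq \mathbf{x}} P(\mathbf{x}, \mathbf{y}) = \sum_{\mathbf{x} \in S_{i, \mathbf{z}_{-i}}} \sum_{\mathbf{y} \in S_{i, \mathbf{z}_{-i}}, \mathbf{y} \neq \mathbf{x}} P(\mathbf{y}, \mathbf{x})$, we have 

$$\frac{|S_i|}{n} - C = \frac{C}{|S_i| - 1} \implies C = \frac{|S_i| - 1}{n},$$

and thus the trace of $P_{i, \mathbf{z}_{-i}}$ is always $1/n$, regardless of $\beta$. The theorem follows since the trace of $P$ is 
exactly the sum of the traces of all $P_{i, \mathbf{z}_{-i}}$'s. $\square$ 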
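Theorem 3.3 is easy to probe numerically, since it applies to any strategic game. The sketch below builds the matrix (2) for a hypothetical 2-player game with arbitrary (non-potential) payoffs and compares the trace at several values of $\beta$. The reference value `expected` is obtained directly from (2): each $\sigma_i(\cdot \mid \mathbf{x}_{-i})$ sums to one, so the diagonal adds up to $\frac{1}{n} \sum_i |S_{-i}|$.

```python
import itertools
import math

def logit_chain(strategy_sets, utility, beta):
    """Transition matrix of eq. (2) over all strategy profiles."""
    n = len(strategy_sets)
    profiles = list(itertools.product(*strategy_sets))
    index = {p: k for k, p in enumerate(profiles)}
    P = [[0.0] * len(profiles) for _ in profiles]
    for x in profiles:
        for i in range(n):
            moves = [x[:i] + (s,) + x[i + 1:] for s in strategy_sets[i]]
            weights = [math.exp(beta * utility(i, y)) for y in moves]
            z = sum(weights)
            for y, w in zip(moves, weights):
                P[index[x]][index[y]] += w / (n * z)
    return profiles, P

def utility(i, x):
    # Hypothetical payoffs; no potential structure is assumed.
    sign = 1.0 if i == 0 else -1.0
    return sign * (x[0] - x[1]) + 0.3 * i * x[1]

strategy_sets = [(0, 1), (0, 1, 2)]
n = len(strategy_sets)

traces = []
for beta in (0.0, 0.7, 5.0):
    _, P = logit_chain(strategy_sets, utility, beta)
    traces.append(sum(P[k][k] for k in range(len(P))))

# (1/n) * sum_i |S_{-i}|: here (3 + 2) / 2.
expected = sum(
    math.prod(len(S) for j, S in enumerate(strategy_sets) if j != i)
    for i in range(n)
) / n
```

The three entries of `traces` should coincide with `expected` up to rounding, even though the individual diagonal entries change with $\beta$.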

The theorem above says that if there exists an eigenvalue of $P$ that gets closer to 1 as $\beta$ increases, 
then there are other eigenvalues that get smaller: this is very promising in the attempt to characterize 
the entire spectrum of eigenvalues of $P$, necessary to use powerful tools such as the well-known random 
target lemma [19]. 

In order to prove our last characterization of the transition matrix generated by the logit dynamics, 
we prove the following lemma which gives a lower bound on the probability that the strategy profile is 
not changed in one step of the logit dynamics for a generic game. 

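Lemma 3.4. Let $\mathcal{G}$ be a game with profile space $S$ and let $P$ be the transition matrix of the logit dynamics 
for $\mathcal{G}$. Then for every $\mathbf{x} \in S$ we have that 

$$P(\mathbf{x}, \mathbf{x}) = \sum_{i=1}^{n} P((\mathbf{x}_{-i}, s_i^\star), \mathbf{x}),$$

where $s_i^\star \neq x_i$ is an arbitrary strategy of player $i$. 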
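Proof. Observe that 

$$P(\mathbf{x}, \mathbf{x}) = 1 - \sum_{\mathbf{y} \in N(\mathbf{x})} P(\mathbf{x}, \mathbf{y}) = \sum_{i} \left( \frac{1}{n} - \sum_{\mathbf{y} \in N_i(\mathbf{x})} P(\mathbf{x}, \mathbf{y}) \right)$$

$$= \sum_{i} \frac{1}{n} \left( 1 - \sum_{\mathbf{y} \in N_i(\mathbf{x})} \frac{e^{\beta u_i(\mathbf{y})}}{e^{\beta u_i(\mathbf{x})} + \sum_{\mathbf{z} \in N_i(\mathbf{x})} e^{\beta u_i(\mathbf{z})}} \right) = \sum_{i} \frac{1}{n} \cdot \frac{e^{\beta u_i(\mathbf{x})}}{e^{\beta u_i(\mathbf{x})} + \sum_{\mathbf{z} \in N_i(\mathbf{x})} e^{\beta u_i(\mathbf{z})}}.$$

The proof concludes by observing that for every $i$ and for every $s_i^\star \in S_i$ with $s_i^\star \neq x_i$, we have 

$$P((\mathbf{x}_{-i}, s_i^\star), \mathbf{x}) = \frac{1}{n} \cdot \frac{e^{\beta u_i(\mathbf{x})}}{e^{\beta u_i(\mathbf{x})} + \sum_{\mathbf{z} \in N_i(\mathbf{x})} e^{\beta u_i(\mathbf{z})}}. \qquad \square$$
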

Lemma 3.4 allows us to calculate the determinant of $P$. 

Theorem 3.5. Let $\mathcal{G}$ be a game with profile space $S$ and let $P$ be the transition matrix of the logit 
dynamics for $\mathcal{G}$. Then the determinant of $P$ is 0. 

Proof. It is well-known that a matrix in which one row can be expressed as a linear combination of other
rows has determinant zero. In this proof, we fix a profile $x$ and show that the row of $P$ corresponding to
$x$ can be obtained as a linear combination of other rows of the matrix. For each player $i$, fix a strategy
$s_i^\star \in S_i$ such that $s_i^\star \neq x_i$. Let us denote with $S^j$, $j = 0, \ldots, n$, the set of profiles $y \in S$ obtained
from $x$ by selecting $j$ players $i_1, \ldots, i_j$ and setting their strategies to $s_{i_1}^\star, \ldots, s_{i_j}^\star$, respectively. Notice
that $x$ belongs to $S^0$. By construction, for every profile $z \in S^j$ and every player $i$, $z_i \in \{x_i, s_i^\star\}$. Now, for $i = 1, \ldots, n$,
consider the profile obtained from $z$ by changing $z_i = x_i$ into $s_i^\star$ or vice versa. Note that there are $n$
such profiles, which are neighbors of $z$ and are all contained in the sets $S^{j-1}$ and $S^{j+1}$. We claim that for
every $y \in S$

$$P(x,y) = \sum_{j=1}^{n} (-1)^{j+1} \sum_{z \in S^j} P(z,y). \qquad (5)$$

In order to prove the claim we distinguish three cases:

1. Let $H(x, y) > 1$ (and thus $P(x, y) = 0$): if there exists $j \in \{0, \ldots, n\}$ such that $y \in S^j$, then the
r.h.s. of (5) becomes $\pm \big( P(y, y) - \sum_i P((y_{-i}, s_i'), y) \big) = 0$, from Lemma 3.4, where $s_i'$ denotes the element of
$\{x_i, s_i^\star\}$ different from $y_i$; if $y \notin \bigcup_{j=0}^{n} S^j$,
then consider a profile $z \in S^j$, for some $j = 1, \ldots, n$, such that $z$ differs from $y$ only in the
strategy of player $k$: if no such profile exists, then the r.h.s. of (5) is 0; otherwise, let us assume
w.l.o.g. $z_k = x_k$ (the case $z_k = s_k^\star$ can be managed similarly), then the profile $z' = (z_{-k}, s_k^\star)$ is
a neighbor of $y$, belongs to the set $S^{j+1}$ and $P(z, y) = P(z', y)$: hence, these two profiles cancel
each other in the r.h.s. of (5), giving the desired result.

2. Let $x, y$ differ in the strategy adopted by the player $k$: if there exists $j \in \{0, \ldots, n\}$ such that $y \in S^j$,
then the r.h.s. of (5) becomes $P(y, y) - \sum_{i \neq k} P((y_{-i}, s_i^\star), y) = P(x, y)$, from Lemma 3.4;
if $y \notin \bigcup_{j=0}^{n} S^j$, then, as above, all profiles in $\bigcup_{j=0}^{n} S^j$ that differ from $y$ only in one player $i \neq k$
cancel each other in the r.h.s. of (5): thus, the only element that survives in the r.h.s. of (5) is
$P((x_{-k}, s_k^\star), y) = P(x, y)$.

3. If $x = y$, then the r.h.s. of (5) becomes $\sum_i P((y_{-i}, s_i^\star), y) = P(x, x)$, from Lemma 3.4.

□
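As a sanity check on Lemma 3.4 and on claim (5), the following minimal sketch builds the logit transition matrix of a small game with randomly drawn utilities and measures how far the two identities are from holding numerically. The helper names (`logit_matrix`, `random_game`, `lemma_3_4_gap`, `row_combination_residual`) and the example game are assumptions made for illustration and do not come from the paper; the residual is computed for the row combination with coefficient $-1$ on $S^j$ for even $j$, $+1$ for odd $j$, and $0$ elsewhere.

```python
import itertools
import math
import random


def logit_matrix(sizes, utility, beta):
    """Transition matrix of the logit dynamics for a generic game.

    sizes[i] is the number of strategies of player i; utility(i, x) is the
    utility of player i in profile x (a tuple of strategy indices).
    """
    n = len(sizes)
    profiles = list(itertools.product(*[range(m) for m in sizes]))
    index = {x: k for k, x in enumerate(profiles)}
    P = [[0.0] * len(profiles) for _ in profiles]
    for x in profiles:
        for i in range(n):
            # Player i is selected with probability 1/n and then picks strategy s
            # with probability proportional to exp(beta * u_i(x_{-i}, s)).
            alts = [x[:i] + (s,) + x[i + 1:] for s in range(sizes[i])]
            weights = [math.exp(beta * utility(i, y)) for y in alts]
            total = sum(weights)
            for y, w in zip(alts, weights):
                P[index[x]][index[y]] += w / (n * total)
    return profiles, index, P


def random_game(sizes, seed):
    """Hypothetical helper: independent uniform utilities in [-1, 1]."""
    rng = random.Random(seed)
    table = {(i, x): rng.uniform(-1.0, 1.0)
             for x in itertools.product(*[range(m) for m in sizes])
             for i in range(len(sizes))}
    return lambda i, x: table[(i, x)]


def lemma_3_4_gap(sizes, profiles, index, P):
    """Largest |P(x,x) - sum_i P((x_{-i}, s_i), x)| over all x, with s_i != x_i."""
    gap = 0.0
    for x in profiles:
        total = 0.0
        for i, m in enumerate(sizes):
            s = (x[i] + 1) % m  # any strategy of player i different from x_i
            z = x[:i] + (s,) + x[i + 1:]
            total += P[index[z]][index[x]]
        gap = max(gap, abs(P[index[x]][index[x]] - total))
    return gap


def row_combination_residual(profiles, P, x, star):
    """max_y |sum_w f(w) P(w, y)|, with f = -1/+1 on S^j (j even/odd), 0 elsewhere."""
    f = []
    for w in profiles:
        if all(wi in (xi, si) for wi, xi, si in zip(w, x, star)):
            j = sum(wi == si for wi, si in zip(w, star))  # w belongs to S^j
            f.append(-1.0 if j % 2 == 0 else 1.0)
        else:
            f.append(0.0)
    size = len(P)
    return max(abs(sum(f[k] * P[k][c] for k in range(size))) for c in range(size))
```

For instance, with `sizes = [2, 3, 2]`, `x = (0, 0, 0)` and `star = (1, 2, 1)`, both quantities should vanish up to floating-point rounding, which is what a zero determinant of $P$ requires.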



Since, as observed above, logit dynamics for potential games defines a reversible Markov chain,
Theorems 3.2 and 3.5 imply that the last eigenvalue of the logit dynamics for these games is exactly 0.
(Note that in the cited reference it is only stated that the last eigenvalue is non-negative.) Moreover, from the proof above, it
turns out that an eigenvector of such zero eigenvalue is given by the function $f : S \to \mathbb{R}$ defined as

$$f(w) = \begin{cases} -1, & \text{if } w \in S^j \text{ and } j \text{ is even;} \\ 1, & \text{if } w \in S^j \text{ and } j \text{ is odd;} \\ 0, & \text{otherwise;} \end{cases}$$

where the sets $S^j$ are defined as in the above proof from some fixed profile $x$.

4 Metastability of potential games



In this section we will present our main results about metastable distributions of the logit dynamics for
$n$-player potential games. Specifically, we will show that for any starting profile there is a distribution
which is metastable for a number of steps super-polynomial in $n$ and whose pseudo-mixing time is
polynomial for a rich set of starting profiles and any $\beta$, and in general for values of $\beta$ small enough.

Throughout this section we will assume that the maximal number $m$ of strategies available to a
player is at most a polynomial in $n$. We can easily drop this assumption by asking for results that are
asymptotic in $\log |S|$, where $|S|$ is the number of profiles of the game: each of our proofs can be
rewritten according to this requirement with small changes. Note that having results asymptotic in the
logarithm of the number of states is a common requirement in Markov chain literature. Moreover, since
$|S| \le m^n$, this requirement is equivalent to asking for results asymptotic in $n$ and in the logarithm of $m$.

In order to prove our main results we use an ad-hoc representation of potential games introduced
in Section 4.1. The proof then starts by describing the metastable distributions of the logit dynamics
through Algorithm 4.6 in Section 4.2. We then bound the pseudo-mixing time when the starting profile
is in the "core" of a metastable distribution (Section 4.3) and when the starting profile is out of the
"core" (Section 4.4). In the latter case, our analysis assumes that $\beta$ is "small" according to the maximum
difference in potential (details on the technical bound on $\beta$ can be found below).

For every piece of our proof, we introduce the necessary technical tools first: in particular, the main 
tools adopted in our proofs are represented by Lemma 4.5 and Lemma 4.16 that relate the hitting time 
to the bottleneck ratio, and Corollary 4.10 that, instead, relates the pseudo-mixing time with the mixing 
time of certain restricted chains. 



4.1 Potential function classes 

Let $\mathcal{G}$ be an $n$-player potential game: for every $j > 0$, we let $\Phi^{(j)}$ be a potential function of $\mathcal{G}$ in the
instance in which the number of players is $j$. A potential function class $\Phi$ is a sequence $\{\Phi^{(j)}\}_{j \in J}$
of potential functions $\Phi^{(j)} : S^{(j)} \to \mathbb{R}$, where $S^{(j)}$ is the set of profiles of the game when the number
of players is $j$, and $J$ is a subset of $\mathbb{N}$ called the index set of $\Phi$. A potential game is different from a
potential function class since for some value of $j$ there might be more than one potential function. However,
it is not hard to see that we can represent a game as a set of potential function classes: roughly speaking, 
instead of seeing the game "horizontally", that is as a sequence of sets of potential functions, where each 
set contains all functions defined for some specified number of players, we see the game "vertically", 
that is as a set of sequences of potential functions, where each sequence contains at most one potential 
value for each number of players. (The choices of the index sets for all the potential function classes 
defined upon a potential game guarantee that in each sequence at most one potential function is defined 
for each number of players.) Note that for every game, there are many different representations of the 
game itself as set of potential function classes: however, we will prove results that hold for any potential 
function class and hence they hold for any game regardless of the specific representation adopted. 

Our approach to prove metastability of potential games relies on the asymptotic properties of such 
potential function classes: more specifically, we will analyze the asymptotic behavior of the bottleneck 
ratio of subsets of profiles. This approach introduces two issues: (i) potential function classes with a 
finite index set J are not relevant asymptotically as we can always define a constant for the asymptotic 
notations greater than the maximum in J; (ii) bottleneck ratios of subsets of states may not be identified 
asymptotically in n. To address the first issue, in the sequel we restrict our attention to potential function 
classes for which J is an infinite subset of natural numbers. To overcome the second issue, we consider 
potential function classes to which a kind of oracle is attached that distinguishes between polynomial and 
super-polynomial bottleneck ratios: we eventually show that this oracle exists for any potential function 
class (see Lemma 4.1).

Formally, given a potential function class $\Phi = \{\Phi^{(j)}\}_{j \in J}$, consider a sequence
$\big\{ A_1^{(j)}, A_2^{(j)}, \ldots, A_{\ell(j)}^{(j)} \big\}_{j \in J}$, where each $A_i^{(j)}$ is a subset of $S^{(j)}$ and $\ell(j) > 0$ for infinitely many values of $j$. For fixed $\beta > 0$,
we say that this sequence is a bottleneck class $\mathcal{B}$ for the pair $(\beta, \Phi)$ if there are functions $f_{\mathcal{B}}$ and $g_{\mathcal{B}}$ such
that for every $n \in J$, $n \ge n_0 = n_0(\beta, \Phi)$ and $1 \le i \le \ell(n)$, we have $f_{\mathcal{B}}^{-1}(n) \le B(A_i^{(n)}) \le g_{\mathcal{B}}^{-1}(n)$. A
bottleneck class $\mathcal{B}$ is asymptotically well-defined (AWD, for short) if it is either the case that $f_{\mathcal{B}}$ and $g_{\mathcal{B}}$
are both at most polynomials in $n$ or they are both at least super-polynomials in $n$. In these cases, we
say that $f_{\mathcal{B}}$ and $g_{\mathcal{B}}$ have the same magnitude. Roughly speaking, an AWD bottleneck class contains all
the subsets of profiles for which the corresponding bottleneck ratios behave "similarly" as $n$ increases,
where the degree of similarity depends on how close $f_{\mathcal{B}}$ and $g_{\mathcal{B}}$ are.

A potential function class $\Phi = \{\Phi^{(j)}\}_{j \in J}$ is asymptotically well-defined (AWD, for short) if for
every $\beta > 0$ and $n \in J$, with $n \ge n_0 = n_0(\beta, \Phi)$, we can map each subset of $S^{(n)}$ to an AWD bottleneck
class for $(\beta, \Phi)$, i.e., each set of profiles is assigned to a bottleneck class $\mathcal{B} = \big\{ \{ A_1^{(n)}, A_2^{(n)}, \ldots, A_{\ell(n)}^{(n)} \} \big\}$,
where each $A_i^{(n)}$ is a subset of $S^{(n)}$ and $f_{\mathcal{B}}$ and $g_{\mathcal{B}}$ have the same magnitude.

The following lemma shows that any potential function class $\Phi$ is actually an AWD potential function class.

Lemma 4.1. Every potential function class $\Phi = \{\Phi^{(j)}\}_{j \in J}$ is an AWD potential function class.

Proof. Fix $\beta > 0$. We will prove that there exists $n_0$ such that for any subset of $S^{(n)}$, with $n \ge n_0$,
we can provide an AWD bottleneck class for $(\beta, \Phi)$ to which the subset belongs, i.e., we can provide
two functions $f$ and $g$, of the same magnitude, which sandwich the inverse of the bottleneck ratio of the
subset of interest. The proof of the existence of $n_0$ is achieved by building (possibly, infinitely) many
AWD bottleneck classes $\mathcal{B}_0, \mathcal{B}_1, \mathcal{B}_2, \ldots$, and showing that for each one of these classes functions $f_i$, $g_i$
and a constant $n_0^i$ exist. We then set $n_0 = \max_i n_0^i$.

Given the AWD bottleneck classes $\mathcal{B}_0, \ldots, \mathcal{B}_i$ for $i \ge 0$, we let $D_i$ be the set of subsets of profiles which are
not included in any of $\mathcal{B}_0, \ldots, \mathcal{B}_i$. We next define the first two AWD bottleneck classes $\mathcal{B}_0$ and $\mathcal{B}_1$.

We let $j_0 = \min J$. The first AWD bottleneck class $\mathcal{B}_0$ contains the sets $S^{(n)}$, for $n \in J$, $n \ge n_0^0 = j_0$.
Each one of these sets has bottleneck ratio 0 and hence setting $g_0$ as an arbitrary super-polynomial
and $f_0 = \infty$ is sufficient. Note also that this bottleneck class cannot contain all subsets of profiles
(i.e., $D_0 \neq \emptyset$): indeed, for each $n \in J$ consider the profile $x$ that maximizes the potential function and
observe that the bottleneck ratio of the set of profiles $R = \{x\}$ is

$$B(R) = \sum_{y \colon H(x,y) = 1} P(x, y) = \frac{1}{n} \sum_i \sum_{s_i \in S_i,\, s_i \neq x_i} \sigma_i(s_i \mid x) = \frac{1}{n} \sum_i \big( 1 - \sigma_i(x_i \mid x) \big).$$
Hence, $1/B(R)$ will always be less than any super-polynomial. Then, we build $\mathcal{B}_1$ by setting $f_1$ as any
polynomial, $g_1 = 1$ and $n_0^1 = j_0$.

Given the above definition for functions $f_0$, $g_0$, $f_1$ and $g_1$, there might be subsets of profiles which are
left out of the AWD bottleneck classes $\mathcal{B}_0$ and $\mathcal{B}_1$, that is, $D_1$ might be non-empty. (A possible scenario
is depicted in Figure 1.) The next claim shows that we can iteratively define the AWD bottleneck classes
$\mathcal{B}_2, \ldots, \mathcal{B}_i$ to eventually have $D_i = \emptyset$, for some $i$.

Claim 4.2. If $D_i \neq \emptyset$, then there exists an AWD bottleneck class $\mathcal{B}_{i+1}$ such that $D_{i+1} = \emptyset$ or $D_{i+1} \subset D_i$.

Figure 1: For each value of $n \in J$ we plot the values of the inverse of the bottleneck ratio of any non-empty
subset of $S^{(n)}$. Given $\mathcal{B}_0$, $\mathcal{B}_1$, if the set of unassigned points $D_1$ does not have a maximum then we
define a new AWD bottleneck class $\mathcal{B}_2$ around a non-decreasing function $p$ that goes through an infinite
sequence of points (these points are colored in gray in the picture). It is necessary to define another
AWD bottleneck class to assign the currently uncovered points, pictured as open circles.

Proof. If $D_i$ admits a maximum value then let $A \subseteq S^{(k)}$ be the set of profiles achieving the maximum
in $D_i$. We fix a polynomial $q$ such that $1/q(n) \le B(A)$ for each $n \ge k$ and we set $f_{i+1} = q$, $g_{i+1} = 1$
and $n_0^{i+1} = k$. We then have $D_{i+1} = \emptyset$.

If $D_i$ does not have a maximum then there exists an infinitely long sequence of previously unassigned
subsets, at most one for each number of players, such that the inverses of their bottleneck ratios belong
to a non-decreasing function $p$. We build a new AWD bottleneck class $\mathcal{B}_{i+1}$ around the function $p$. We
focus on the growth rate of $p$: it will be either at most polynomial or at least super-polynomial. In both
cases, there exist a constant $n_0^{i+1}$ and two functions $f_{i+1}$ and $g_{i+1}$ of the same magnitude such that for
every $n \ge n_0^{i+1}$, $g_{i+1}(n) \le p(n) \le f_{i+1}(n)$ (see Figure 1). Since $\mathcal{B}_{i+1}$ includes (at least) all the points
of $p$ which were previously unassigned, we can conclude that $D_i \supset D_{i+1}$. □

This concludes the proof.$^4$ □

Observe that given an AWD potential function class $\Phi$, $\beta > 0$, and a set $A^{(n)} \subseteq S^{(n)}$, for some
$n \ge n_0$, we can express the bottleneck ratio of $A^{(n)}$ as an asymptotic function of $n$, through the functions
$f$ and $g$ of the AWD bottleneck class for $(\beta, \Phi)$ to which $A^{(n)}$ belongs. Therefore, in the sequel we will
write that $B(A^{(n)}) = e^{-O(\log n)}$ when $A^{(n)}$ belongs to the AWD bottleneck class $\mathcal{B}$ and both $f_{\mathcal{B}}$ and
$g_{\mathcal{B}}$ are less than or equal to a polynomial function in $n$. Similarly, we say that $B(A^{(n)}) = e^{-\omega(\log n)}$ when
both $f_{\mathcal{B}}$ and $g_{\mathcal{B}}$ are super-polynomial functions in $n$.

4.2 Metastable distributions 

Let us start by showing that for any AWD potential function class we can find distributions that are 
metastable for a number of steps super-polynomial in n. 

4 Note that, throughout the proof, we make some arbitrary choices (as, for example, the choice of the infinite sequence defining
the function $p$): any such choice can give rise to different AWD bottleneck classes and, possibly, to different metastable
distributions. Moreover, these choices will influence the kind of polynomial needed for quick convergence and the sort of
super-polynomial controlling the stability of the distributions.

Theorem 4.3. Let $\Phi$ be an AWD potential function class. Then, for every $\beta > 0$, every $\varepsilon > 0$, and
every $n$ large enough, there exist a distribution $\mu$ and a function $T(n)$ super-polynomial in $n$ such that
$\mu$ is $(\varepsilon, \varepsilon \cdot T(n))$-metastable for the logit dynamics on $\Phi^{(n)} \in \Phi$. Moreover, if more than one of these
distributions exist, also their convex combination is $(\varepsilon, \varepsilon \cdot T(n))$-metastable.

In addition to proving the theorem above, the definitions and lemmata given in this section will be 
useful also in the following. 



4.2.1 Technical preliminaries 

For a potential function $\Phi$ over a profile space $S$ and a rationality level $\beta$, let $P$ be the transition matrix
of the Markov chain defined by the logit dynamics on $\Phi$. For a non-empty $L \subseteq S$, we denote with $\bar{P}_L$
the matrix

$$\bar{P}_L(x, y) = \begin{cases} P(x, y), & \text{if } x, y \in L; \\ 0, & \text{otherwise.} \end{cases} \qquad (6)$$

Let $\lambda_1^L \ge \lambda_2^L \ge \ldots \ge \lambda_{|S|}^L$ be the eigenvalues of $\bar{P}_L$: notice that $\lambda_1^L$ can be different from 1 since
the matrix $\bar{P}_L$ is not stochastic. Theorem 3.2 implies that $\lambda_1^L \ge \lambda_2^L \ge \ldots \ge \lambda_{|S|}^L \ge 0$, and thus for
$\lambda^L_{\max}$, the largest eigenvalue of $\bar{P}_L$ in absolute value, we have: $\lambda^L_{\max} = \max_i |\lambda_i^L| = \lambda_1^L$. We next give
a characterization of $1 - \lambda^L_{\max}$ in terms of bottleneck ratio in the following lemma (which is an easy
extension of the similar characterization of the spectral gap of stochastic matrices).

Lemma 4.4. For finite $\beta$ and any $\emptyset \neq L \subseteq S$, $1 - \lambda^L_{\max} \le B(L)$.

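Proof. Define the function $\varphi_L : S \to [0, 1]$ to be such that $\varphi_L(x) = \pi(L)$ if $x \in L$, and $\varphi_L(x) = 0$
otherwise. Consider now the function

$$\mathcal{E}_P(\varphi_L) := \frac{1}{2} \sum_{x, y} \pi(x) P(x, y) \big( \varphi_L(x) - \varphi_L(y) \big)^2. \qquad (7)$$

By Theorem 2.2, $\pi(L) \neq 0$ and then $\langle \varphi_L, \varphi_L \rangle_\pi = \pi(L)^3 \neq 0$. Moreover, denote with $\partial L$ the set of
profiles $x \in L$ that have at least one neighbor profile in $S \setminus L$ and with $E(A_1, A_2)$ the pairs of neighbor
profiles $(x, y)$ such that $x \in A_1$ and $y \in A_2$. We have:

$$\mathcal{E}_P(\varphi_L) = \frac{\pi(L)^2}{2} \left( \sum_{(x,y) \in E(L, S \setminus L)} \pi(x) P(x, y) + \sum_{(x,y) \in E(S \setminus L, L)} \pi(x) P(x, y) \right)
= \pi(L)^2 \sum_{x \in \partial L} \pi(x) \sum_{y \in S \setminus L \colon H(x,y) = 1} P(x, y) = \pi(L)^2 \, Q(L, S \setminus L),$$

where we used the reversibility of $P$ in the penultimate equality. Hence, we have $\frac{\mathcal{E}_P(\varphi_L)}{\langle \varphi_L, \varphi_L \rangle_\pi} = B(L)$. The
lemma follows since $1 - \lambda^L_{\max} \le \frac{\mathcal{E}_P(\varphi_L)}{\langle \varphi_L, \varphi_L \rangle_\pi}$ (see Lemma B.1 in Appendix). □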

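The above represents the main ingredient to prove the following relation between bottleneck ratio
and hitting time. We recall that for $L \subseteq S$, $L \neq \emptyset$, we let $\tau_{S \setminus L}$ denote the random variable whose value
is the first time that the logit dynamics for a potential function $\Phi$ hits a profile in $S \setminus L$.

Lemma 4.5. Let $\Phi$ be a potential function with profile space $S$ and let $P$ be the transition matrix of the
logit dynamics for $\Phi$. Then for finite $\beta$ and $L \subseteq S$, $L \neq \emptyset$, we have

$$\min_{x \in L} \mathbf{P}_x\big( \tau_{S \setminus L} \le t \big) \le t \cdot \frac{B(L)}{1 - B(L)}.$$

Proof. We observe:

$$\min_{x \in L} \mathbf{P}_x\big( \tau_{S \setminus L} \le t \big) = 1 - \max_{x \in L} \mathbf{P}_x\big( \tau_{S \setminus L} > t \big)$$
$$\text{(see Theorem B.2)} \quad \le 1 - \exp\big( t \log \lambda^L_{\max} \big) = 1 - \exp\big( t \log ( 1 - (1 - \lambda^L_{\max}) ) \big)$$
$$\text{(since } 1 - a \ge e^{-\frac{a}{1-a}} \text{)} \quad \le 1 - \exp\left( -t \cdot \frac{1 - \lambda^L_{\max}}{\lambda^L_{\max}} \right)$$
$$\text{(by Lemma 4.4)} \quad \le 1 - \exp\left( -t \cdot \frac{B(L)}{1 - B(L)} \right)$$
$$\text{(since } 1 - e^{-a} \le a \text{)} \quad \le t \cdot \frac{B(L)}{1 - B(L)}.$$

□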
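The bound of Lemma 4.5 can be checked numerically on a toy instance. The sketch below is only an illustration under stated assumptions: every player's utility is taken to be $-\Phi$ (so that the stationary distribution is proportional to $e^{-\beta\Phi}$), and the potential `coordination_potential`, the function names and the parameter values are hypothetical choices, not taken from the paper.

```python
import itertools
import math


def logit_chain(n, m, phi, beta):
    """Logit dynamics of the game in which every player has utility -phi."""
    S = list(itertools.product(range(m), repeat=n))
    idx = {x: k for k, x in enumerate(S)}
    P = [[0.0] * len(S) for _ in S]
    for x in S:
        for i in range(n):
            alts = [x[:i] + (s,) + x[i + 1:] for s in range(m)]
            w = [math.exp(-beta * phi(y)) for y in alts]
            for y, wy in zip(alts, w):
                P[idx[x]][idx[y]] += wy / (n * sum(w))
    Z = sum(math.exp(-beta * phi(x)) for x in S)
    pi = [math.exp(-beta * phi(x)) / Z for x in S]  # Gibbs stationary distribution
    return S, idx, P, pi


def bottleneck_ratio(P, pi, L):
    """B(L) = Q(L, S \\ L) / pi(L), where L is a set of state indices."""
    outside = [y for y in range(len(pi)) if y not in L]
    flow = sum(pi[x] * P[x][y] for x in L for y in outside)
    return flow / sum(pi[x] for x in L)


def min_exit_probability(P, L, t):
    """min over x in L of P_x(tau_{S \\ L} <= t)."""
    members = sorted(L)
    stay = {x: 1.0 for x in members}  # P_x(tau > 0) = 1 for x in L
    for _ in range(t):
        # P_x(tau > s) = sum over y in L of P(x, y) * P_y(tau > s - 1)
        stay = {x: sum(P[x][y] * stay[y] for y in members) for x in members}
    return min(1.0 - stay[x] for x in members)


def coordination_potential(x):
    """Illustrative potential: minus the number of pairs of players that agree."""
    return -sum(a == b for a, b in itertools.combinations(x, 2))
```

Taking three players with two strategies each, $\beta = 2$ and $L$ equal to the profiles with at most one player on strategy 1, one can compare `min_exit_probability(P, L, t)` with `t * B / (1 - B)` for a range of values of `t`.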



4.2.2 Description of the metastable distributions 

For any fixed $n$, in order to establish the metastable behavior of the logit dynamics for a function
$\Phi = \Phi^{(n)}$ in an AWD potential function class, we define subsets $R_1, \ldots, R_k$ of the strategy profile set
$S = S^{(n)}$, with $k = k(n) \ge 1$: these subsets are the supports of the metastable distributions of the
logit dynamics for $\Phi$. Moreover, we partition $S$ in $k + 1$ subsets $T_1, \ldots, T_k$ and $N$. Roughly speaking,
$T_1, \ldots, T_k$ represent the "core" of the sets $R_1, \ldots, R_k$, i.e. $T_i$ contains profiles of $R_i$ from which it is
"hard" to leave $R_i$; the last subset $N$ simply contains the remaining profiles of $S$.

The sets $R_1, \ldots, R_k$, $T_1, \ldots, T_k$ and $N$ are built according to the following procedure. The procedure
works its way through subsets of $S$ by finding subsets of profiles that act as a bottleneck for the
Markov chain. The algorithm takes in input an AWD potential function class $\Phi$, a rationality level $\beta$, a
constant $\delta > 0$ and $n$ large enough.

Algorithm 4.6. Set $N = S$ and $i = 1$. While there is a set $L \subseteq N$ such that $B(L) = e^{-\omega(\log n)}$, do:

1. Denote with $R_i$ one such subset with the smallest stationary probability;

2. Choose a polynomial $p_i$;

3. Denote with $T_i$ the largest subset of $R_i$ such that for every $y \in T_i$,

$$\mathbf{P}_y\big( \tau_{S \setminus R_i} \le p_i(n) \big) \le \delta;$$

4. Delete from $N$ all profiles contained in $T_i$ and increase $i$.
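Algorithm 4.6 is stated asymptotically, so it cannot be run literally on a single finite chain. The sketch below is a finite-size stand-in written under explicit assumptions: the condition $B(L) = e^{-\omega(\log n)}$ is replaced by a numeric `threshold`, the polynomial chosen in Step 2 by a fixed number of `steps`, candidate sets are found by brute force over all subsets of $N$, and a guard stops the loop if the core comes out empty. All names and the 4-state example chain are hypothetical.

```python
import itertools


def bottleneck_ratio(P, pi, L):
    """B(L) = Q(L, S \\ L) / pi(L) for a collection L of state indices."""
    outside = [y for y in range(len(pi)) if y not in L]
    flow = sum(pi[x] * P[x][y] for x in L for y in outside)
    return flow / sum(pi[x] for x in L)


def exit_probability(P, R, y, steps):
    """P_y(tau_{S \\ R} <= steps) for a starting state y in R."""
    stay = {x: 1.0 for x in R}
    for _ in range(steps):
        stay = {x: sum(P[x][z] * stay[z] for z in R) for x in R}
    return 1.0 - stay[y]


def metastable_sets(P, pi, threshold, steps, delta):
    """Finite-size stand-in for Algorithm 4.6; returns (supports, cores, N)."""
    N = set(range(len(pi)))
    supports, cores = [], []
    while True:
        candidates = [frozenset(c)
                      for k in range(1, len(N) + 1)
                      for c in itertools.combinations(sorted(N), k)
                      if bottleneck_ratio(P, pi, c) <= threshold]
        if not candidates:
            break
        # Step 1: a candidate of smallest stationary probability (ties by label).
        R = min(candidates, key=lambda L: (sum(pi[x] for x in L), sorted(L)))
        # Steps 2-3: keep the states from which leaving R within `steps` is unlikely.
        T = frozenset(y for y in R if exit_probability(P, R, y, steps) <= delta)
        if not T:
            break  # guard: outside the asymptotic regime the core may be empty
        supports.append(R)
        cores.append(T)
        N -= T  # Step 4
    return supports, cores, N


# Illustrative symmetric chain (uniform stationary law) with two wells,
# {0, 1} and {2, 3}, joined by a rare transition.
P = [[0.5, 0.5, 0.0, 0.0],
     [0.5, 0.499, 0.001, 0.0],
     [0.0, 0.001, 0.499, 0.5],
     [0.0, 0.0, 0.5, 0.5]]
pi = [0.25, 0.25, 0.25, 0.25]
```

On this example, `metastable_sets(P, pi, threshold=0.01, steps=10, delta=0.1)` identifies the two wells as supports and as cores, and leaves $N$ empty. The brute-force search is exponential in the number of states and is only meant to make the four steps concrete.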

Let us make a number of observations about the algorithm above. First, if there is a disconnected
set $L$ with super-polynomially small bottleneck ratio, then each connected component of $L$ will have
bottleneck ratio the inverse of a super-polynomial and smaller stationary probability: hence, the set $R_i$
returned by the algorithm will be connected. Moreover, we notice that the algorithm above enters at
least once in the loop (and thus at least a subset $R_i$ is returned) since $B(S) = 0$. Then we notice that,
from Lemma 4.5, for every $\delta > 0$ in input, every $R_i$ output of the first step and every $p_i(\cdot)$ chosen in
the second step, the third step in the loop returns a non-empty $T_i$: indeed, Lemma 4.5 implies that there
exists at least one $y \in R_i$ such that

$$\mathbf{P}_y\big( \tau_{S \setminus R_i} \le p_i(n) \big) \le \frac{p_i(n) \cdot B(R_i)}{1 - B(R_i)} = \frac{p_i(n)}{e^{\omega(\log n)} - 1} \le \delta,$$

where the last step holds for $n$ sufficiently large. Finally, we observe that at the end of the algorithm
for every $x \in N$ there does not exist a subset $L \subseteq N$ containing $x$ for which $B(L) = e^{-\omega(\log n)}$; on
the contrary, for every subset $R_i$ returned by Algorithm 4.6, $B(R_i)$ is $e^{-\omega(\log n)}$. We can then prove the
following fact.

Lemma 4.7. Let $\Phi$ be an AWD potential function class, fix $\beta > 0$ and $n$ large enough. Let $R_j$ be one of
the subsets returned by Algorithm 4.6 on input $\beta$ and $n$ for some choice of constant $\delta$ in input and of
the polynomials $p_1, \ldots, p_{j-1}$ in Step 2. Let $\mu_j$ be the distribution such that $\mu_j(y) = \frac{\pi(y)}{\pi(R_j)}$ if $y \in R_j$
and $\mu_j(y) = 0$ otherwise. Then, for every $\varepsilon > 0$, $\mu_j$ is $(\varepsilon, \varepsilon \cdot T(n))$-metastable for the logit dynamics
on $\Phi^{(n)}$, $T$ being a super-polynomial function in the input.



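Proof. By Lemma 2.4, $\mu_j$ is $(B(R_j), 1)$-metastable. By Lemma 2.5, $\mu_j$ is also $(B(R_j) \cdot T, T)$-metastable,
for any $T \ge 1$. Given $\varepsilon > 0$, let $T_\varepsilon$ be such that $\varepsilon = B(R_j) \cdot T_\varepsilon$; we then have that $\mu_j$
is $(\varepsilon, T_\varepsilon)$-metastable where $T_\varepsilon = \varepsilon \cdot B(R_j)^{-1} = \varepsilon \cdot e^{\omega(\log n)}$. □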
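The two steps of this proof can be observed on a small chain: after one step the restricted stationary distribution $\mu_R$ moves by exactly $B(R)$ in total variation, and after $t$ steps by at most $t \cdot B(R)$. The sketch below is illustrative only; the function names and the 4-state symmetric chain (whose stationary distribution is uniform) are assumptions, not objects from the paper.

```python
def step(mu, P):
    """One step of the chain applied to the row vector mu."""
    size = len(mu)
    return [sum(mu[x] * P[x][y] for x in range(size)) for y in range(size)]


def tv(mu, nu):
    """Total variation distance between two distributions on the same space."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))


def restricted_stationary(pi, R):
    """mu_R(y) = pi(y) / pi(R) on R and 0 elsewhere."""
    mass = sum(pi[x] for x in R)
    return [pi[x] / mass if x in R else 0.0 for x in range(len(pi))]


def bottleneck_ratio(P, pi, R):
    """B(R) = Q(R, S \\ R) / pi(R)."""
    outside = [y for y in range(len(pi)) if y not in R]
    flow = sum(pi[x] * P[x][y] for x in R for y in outside)
    return flow / sum(pi[x] for x in R)


def metastability_profile(P, pi, R, horizon):
    """TV distance between mu_R P^t and mu_R for t = 1, ..., horizon."""
    mu = restricted_stationary(pi, R)
    current, distances = mu, []
    for _ in range(horizon):
        current = step(current, P)
        distances.append(tv(current, mu))
    return distances


# Illustrative symmetric chain with two wells, {0, 1} and {2, 3}.
P = [[0.5, 0.5, 0.0, 0.0],
     [0.5, 0.499, 0.001, 0.0],
     [0.0, 0.001, 0.499, 0.5],
     [0.0, 0.0, 0.5, 0.5]]
pi = [0.25, 0.25, 0.25, 0.25]
```

With `R = {0, 1}`, the list returned by `metastability_profile(P, pi, R, 50)` stays below `t * bottleneck_ratio(P, pi, R)` at every step `t`, so $\mu_R$ is $(\varepsilon, T_\varepsilon)$-metastable with $T_\varepsilon = \varepsilon / B(R)$ on this example.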

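Finally, the following lemma shows that a combination of metastable distributions is metastable.

Lemma 4.8. Let $P$ be the transition matrix of a Markov chain with state space $S$ and let $\mu_i$ be a distribution
$(\varepsilon_i, T_i)$-metastable for $P$, for $i = 1, 2, \ldots$. Set $\varepsilon = \max_i \varepsilon_i$ and $T = \min_i \{T_i\}$. Then, the distribution
$\mu = \sum_i \alpha_i \mu_i$, with $\sum_i \alpha_i = 1$ and $\alpha_i \ge 0$, is $(\varepsilon, T)$-metastable.

Proof. For every $t \le T$ we have

$$\| \mu P^t - \mu \|_{TV} = \max_{A \subseteq S} \big| (\mu P^t)(A) - \mu(A) \big|
= \max_{A \subseteq S} \Big| \sum_i \alpha_i \big( (\mu_i P^t)(A) - \mu_i(A) \big) \Big|
\le \sum_i \alpha_i \max_{A \subseteq S} \big| (\mu_i P^t)(A) - \mu_i(A) \big| \le \sum_i \alpha_i \varepsilon_i \le \varepsilon.$$

□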

4.3 Pseudo-mixing time starting from $T_i$

In this section, we will prove that the logit dynamics for an AWD potential function class and any $\beta$
converges in polynomial time to a metastable distribution, whenever the starting point is selected from
the "core" $T_i$ of this distribution.

4.3.1 Technical preliminaries 

Let $\Phi$ be a potential function on profile space $S$. Let $P$ be the transition matrix of the logit dynamics
on $\Phi$ and let $\pi$ be the corresponding stationary distribution. For $L \subseteq S$ non-empty, we define a Markov
chain with state space $L$ and transition matrix $P_L$ defined as follows.

$$P_L(x, y) = \begin{cases} P(x, y), & \text{if } x \neq y; \\ 1 - \sum_{z \in L,\, z \neq x} P(x, z) = P(x, x) + \sum_{z \in S \setminus L} P(x, z), & \text{otherwise.} \end{cases} \qquad (8)$$

It is easy to check that the stationary distribution of this Markov chain is given by the distribution
$\pi_L(x) = \frac{\pi(x)}{\pi(L)}$, for every $x \in L$. Note also that the Markov chain defined upon $P_L$ is reversible and aperiodic,
since the Markov chain defined upon $P$ is, and it will be irreducible if $L$ is a connected set (as $R_i$ is).
Moreover, it is immediate to see that $P_L$ is also a nice restriction of $P$ and hence all its eigenvalues are
non-negative by Theorem 3.2. Sometimes, we slightly abuse the notation and denote with $P_L$ and $\pi_L$
also the Markov chain and the distribution defined on the entire state space $S$, assuming $P_L(x, y) = 0$
if $x \notin L$ or $y \notin L$, and similarly $\pi_L(x) = 0$ when $x \notin L$: reversibility and non-negativeness of
eigenvalues continue to hold also in this case.

For $L \subseteq S$ we set $\partial L$ as the border of $L$, that is the set of profiles in $L$ with at least a neighbor in
$S \setminus L$. Recall that $\tau_{S \setminus L}$ is the random variable denoting the first time the Markov chain with transition
matrix $P$ hits a profile $x \in S \setminus L$. The following lemma formally proves the intuitive fact that, by
starting from a profile in $L$, the chain $P$ and the chain $P_L$ are the same up to the time in which the former
chain hits a profile in $S \setminus L$. The proof uses the well-known coupling technique (cf., e.g., [19]) which is
summarized in Appendix D.

Lemma 4.9. Let $P$ be the transition matrix of a Markov chain with state space $S$ and let $P_L$ be the
restriction of $P$ to $L \subseteq S$, $L \neq \emptyset$, as given in (8). Then, for every $x \in L$ and for every $t > 0$,

$$\big\| P^t(x, \cdot) - P_L^t(x, \cdot) \big\|_{TV} \le \mathbf{P}_x\big( \tau_{S \setminus L} \le t \big).$$

Proof. Consider the following coupling $(X_t, Y_t)_{t \ge 0}$ of the Markov chains with transition matrix $P$ and
$P_L$, respectively:

• If $X_i = Y_i \in L \setminus \partial L$, then we update the first chain according to $P$ and obtain $X_{i+1}$; we then set
$Y_{i+1} = X_{i+1}$;

• If $X_i = Y_i \in \partial L$, then we update the first chain according to $P$: if $X_{i+1} \in L$, then we set
$Y_{i+1} = X_{i+1}$, otherwise we set $Y_{i+1} = Y_i$;

• If $X_i \neq Y_i$, then we update the chains independently.

Since $X_0 = Y_0 = x \in L$, we have that $X_t \neq Y_t$ only if $\tau_{S \setminus L} \le t$. Thus, by the properties of couplings
(see Theorem D.1), we have

$$\big\| P^t(x, \cdot) - P_L^t(x, \cdot) \big\|_{TV} \le \mathbf{P}_x\big( X_t \neq Y_t \big) \le \mathbf{P}_x\big( \tau_{S \setminus L} \le t \big).$$

□
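The restriction (8) and the inequality of Lemma 4.9 are easy to experiment with on a small chain. The following sketch is an illustration under assumptions: the function names and the 4-state example chain are hypothetical, and the restricted matrix is stored on the whole state space with zero rows and columns outside $L$, in line with the abuse of notation mentioned above.

```python
def restrict(P, L):
    """Matrix P_L of (8), stored on the full state space (zero outside L)."""
    size = len(P)
    PL = [[0.0] * size for _ in range(size)]
    for x in L:
        for y in L:
            if x != y:
                PL[x][y] = P[x][y]
        # Moves that would leave L are turned into self-loops.
        PL[x][x] = 1.0 - sum(PL[x][y] for y in L if y != x)
    return PL


def row_after(P, x, t):
    """The distribution P^t(x, .)."""
    size = len(P)
    row = [1.0 if y == x else 0.0 for y in range(size)]
    for _ in range(t):
        row = [sum(row[z] * P[z][y] for z in range(size)) for y in range(size)]
    return row


def exit_probability(P, L, x, t):
    """P_x(tau_{S \\ L} <= t) for a starting state x in L."""
    stay = {y: 1.0 for y in L}
    for _ in range(t):
        stay = {y: sum(P[y][z] * stay[z] for z in L) for y in L}
    return 1.0 - stay[x]


def tv(mu, nu):
    """Total variation distance between two distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))


# Illustrative symmetric chain with two wells, {0, 1} and {2, 3}.
P = [[0.5, 0.5, 0.0, 0.0],
     [0.5, 0.499, 0.001, 0.0],
     [0.0, 0.001, 0.499, 0.5],
     [0.0, 0.0, 0.5, 0.5]]
```

For `L = {0, 1}` one can compare `tv(row_after(P, x, t), row_after(restrict(P, L), x, t))` with `exit_probability(P, L, x, t)`: the two chains can only be told apart once the original chain has left $L$.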



The following corollary follows from Lemma 4.9 and the triangle inequality property of the total
variation distance.

Corollary 4.10. Let $P$ be the transition matrix of a Markov chain with state space $S$ and let $P_L$ be the
restriction of $P$ to a non-empty $L \subseteq S$ as given in (8). Then, for every $x \in L$ and for every $t > 0$,

$$\big\| P^t(x, \cdot) - \pi_L \big\|_{TV} \le \big\| P_L^t(x, \cdot) - \pi_L \big\|_{TV} + \mathbf{P}_x\big( \tau_{S \setminus L} \le t \big).$$

4.3.2 Pseudo-mixing time 



Corollary 4.10 gives us a tool to prove a bound on the pseudo-mixing time of the logit dynamics in terms
of the mixing time of the following Markov chains.

Let $\mathcal{R}_i^{\Phi,\beta}$ be the Markov chain with vertex set $R_i$ (as defined by Algorithm 4.6 on input$^5$ an AWD
potential function class $\Phi$, a rationality level $\beta$ and $n$) and transition matrix $P_{R_i}$ as defined in (8).
(Whenever $\beta$ and $\Phi$ are understood we simply write $\mathcal{R}_i$.) We begin by studying the asymptotic mixing
time of this chain as a function of the number of players $n$. Note that we are slightly departing from
the usual mixing time bounds given in terms of the logarithm of the number of states. Here, the growth


5 Note that we do not need to specify the choice of the constant $\delta$ and of the polynomials $p_1, \ldots, p_{i-1}$: the set $R_i$ has all
the properties we need for any such choice.

parameter of interest is $n$; given $\Phi$ and $\beta$, $n$ defines $R_i$ via Algorithm 4.6. To stress the dependence of
$\mathcal{R}_i$ from $n$, we will say that $\mathcal{R}_i$ has length $n$.$^6$

We initially prove the following fact on nodes of the chain $\mathcal{R}_i$.

Lemma 4.11. For any $\beta$, and for every two profiles $x, y$ of $R_i$ we have $|\Phi^{(n)}(x) - \Phi^{(n)}(y)| \le \frac{f(n)}{\beta}$,
$f(n) = O(\log n)$, $n$ being the length of $\mathcal{R}_i$.

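Proof. Assume by contradiction that the claim is false. That is, there exist $\beta$ and a subset of states
$R_{i-}$ of $R_i$ such that for every $y \in R_{i-}$ we have $\Phi^{(n)}(y) - \Phi^{(n)}(x^\star) = \frac{\omega(\log n)}{\beta}$, where $x^\star$ is the profile
that has the minimum potential value among profiles in $R_i$. Let us define $R_{i+} = R_i \setminus R_{i-}$. Notice that
for every $y \in R_{i+}$ we have $\Phi^{(n)}(y) - \Phi^{(n)}(x^\star) = \frac{O(\log n)}{\beta}$. Moreover, observe that $R_{i+}$ is not empty since
$x^\star$ belongs to it. Finally, for $x \in R_{i+}$, $y \in R_{i-}$, we have $\pi(x) \ge \pi(y)$ since $\pi(x)$ is proportional to
$e^{-\beta \Phi^{(n)}(x)}$ and $\pi(y)$ is proportional to $e^{-\beta \Phi^{(n)}(y)}$, while $\Phi^{(n)}(y) \ge \Phi^{(n)}(x)$ by definition of $R_{i+}$ and $R_{i-}$.

Now, we consider two cases depending on how the ratio $\frac{\pi(R_i)}{\pi(R_{i+})}$ evolves as $n$ grows.

If $\frac{\pi(R_i)}{\pi(R_{i+})} \ge e^{H(n)}$, $H(x) = \omega(\log x)$: Since $\pi(R_i) = \pi(R_{i+}) + \pi(R_{i-})$, we have

$$\frac{\pi(R_{i+})}{\pi(R_{i-})} \le \frac{e^{-H(n)} \cdot \pi(R_i)}{\pi(R_{i-})} = \frac{e^{-H(n)} \cdot \pi(R_i)}{\pi(R_i) - \pi(R_{i+})} \le \frac{e^{-H(n)}}{1 - e^{-H(n)}} = e^{-\omega(\log n)}.$$

Moreover, observe that for each profile $x \in R_{i+}$ there are at most $m \cdot n$ neighbors in $R_{i-}$: since
$m \le \mathrm{poly}(n)$ we can write $m \cdot n = e^{O(\log n)}$. By denoting with $\partial_R R_{i-}$ the set of profiles in $R_{i-}$ that
have at least a neighbor in $R_{i+}$, we have

$$\pi(\partial_R R_{i-}) \le \sum_{x \in R_{i+}} \sum_{y \in R_{i-} \cap N(x)} \pi(y) \le \sum_{x \in R_{i+}} \sum_{y \in R_{i-} \cap N(x)} \pi(x) \le e^{O(\log n)} \pi(R_{i+}),$$

where as observed above we use the fact that the definition of $R_{i-}$, $R_{i+}$ yields $\pi(y) \le \pi(x)$. It follows
that

$$\frac{Q(R_{i-}, R_{i+})}{\pi(R_{i-})} \le \frac{\pi(\partial_R R_{i-})}{\pi(R_{i-})} \le e^{O(\log n)} \cdot \frac{\pi(R_{i+})}{\pi(R_{i-})} = e^{O(\log n)} \cdot e^{-\omega(\log n)} = e^{-\omega(\log n)}.$$

Finally, since $\pi(R_i) = \pi(R_{i-}) \left( 1 + \frac{\pi(R_{i+})}{\pi(R_{i-})} \right)$ and as $R_{i-} \subseteq R_i$, we have

$$\frac{Q(R_{i-}, S \setminus R_i)}{\pi(R_{i-})} \le \big( 1 + e^{-\omega(\log n)} \big) \frac{Q(R_{i-}, S \setminus R_i)}{\pi(R_i)} \le \big( 1 + e^{-\omega(\log n)} \big) \frac{Q(R_i, S \setminus R_i)}{\pi(R_i)} = e^{-\omega(\log n)},$$

where the last equality holds since $B(R_i) = e^{-\omega(\log n)}$ or otherwise $R_i$ would not have been returned
by the algorithm. Thus,

$$B(R_{i-}) = \frac{Q(R_{i-}, R_{i+})}{\pi(R_{i-})} + \frac{Q(R_{i-}, S \setminus R_i)}{\pi(R_{i-})} = e^{-\omega(\log n)}.$$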



6 There can be values of $n$ for which the algorithm does not run the $i$-th iteration and thus $\mathcal{R}_i$ is not well defined. However,
as long as there are infinitely many values of $n$ for which $R_i$ is computed then asymptotic bounds on the mixing time of $\mathcal{R}_i$ are
well defined. Since the algorithm executes at least one iteration for any input, we have that there exists $n_0$ such that for
$i \le \max_{n \ge n_0} k(n)$, $R_i$ is computed infinitely many times.

If $\frac{\pi(R_i)}{\pi(R_{i+})} \le e^{L(n)}$, $L(x) = O(\log x)$: Since $B(R_i) = e^{-\omega(\log n)}$, or otherwise $R_i$ would not have been returned by
the algorithm, and as $R_{i+} \subseteq R_i$, we have

$$\frac{Q(R_{i+}, S \setminus R_i)}{\pi(R_{i+})} \le e^{L(n)} \cdot \frac{Q(R_i, S \setminus R_i)}{\pi(R_i)} = e^{-\omega(\log n)}.$$

Furthermore, observe that for each $x \in R_{i+}$ and $y \in R_{i-}$ such that $x$ and $y$ differ only in the
strategy played by the player $j$, we have

$$P(x, y) = \frac{1}{n} \cdot \frac{e^{\beta u_j(y)}}{e^{\beta u_j(x)} + \sum_{z \in N_j(x)} e^{\beta u_j(z)}}
= \frac{1}{n} \cdot \frac{1}{e^{\beta(\Phi(y) - \Phi(x))} + \sum_{z \in N_j(x)} e^{-\beta(\Phi(z) - \Phi(y))}}$$
$$\le \frac{1}{n} \cdot \frac{1}{e^{\beta(\Phi(y) - \Phi(x))}}
= \frac{1}{n} \cdot \frac{1}{e^{\beta[(\Phi(y) - \Phi(x^\star)) - (\Phi(x) - \Phi(x^\star))]}}
= \frac{1}{\exp\big( \omega(\log n) - O(\log n) + \log n \big)} = e^{-\omega(\log n)}.$$

Hence $\frac{Q(R_{i+}, R_{i-})}{\pi(R_{i+})} = e^{-\omega(\log n)}$ and thus,

$$B(R_{i+}) = \frac{Q(R_{i+}, R_{i-})}{\pi(R_{i+})} + \frac{Q(R_{i+}, S \setminus R_i)}{\pi(R_{i+})} = e^{-\omega(\log n)}.$$

In both cases, we have that there exists a set that is contained in $R_i$, and hence its stationary probability
is less than $\pi(R_i)$, and that has bottleneck ratio the inverse of a super-polynomial: as such, this set must
be chosen before $R_i$ by Algorithm 4.6. But since in the third step of the algorithm at least one element
of such sets is deleted from $N$, we have that $R_i$ cannot be returned by the algorithm,
thus a contradiction. □

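Lemma 4.12. For any $\beta$, the mixing time $t_{mix}$ of $\mathcal{R}_i^{\Phi,\beta}$ is at most polynomial in its length $n$.

Proof. For a generic set $L$, let us denote with $B_L(A)$ the bottleneck ratio of $A \subseteq L$ in the Markov chain
with state space $L$ and transition matrix $P_L$. We will show that

$$\min_{A \subseteq R_i \colon \pi_{R_i}(A) \le 1/2} B_{R_i}(A) = e^{-O(\log n)}.$$

Suppose, by contradiction, that there exists $A_\star \subseteq R_i$ with $\pi_{R_i}(A_\star) \le 1/2$ such that
$B_{R_i}(A_\star) = e^{-\omega(\log n)}$. We distinguish two cases on the value of $\frac{Q(A_\star, S \setminus R_i)}{\pi(A_\star)}$.

If $\frac{Q(A_\star, S \setminus R_i)}{\pi(A_\star)} = e^{-\omega(\log n)}$. We have

$$B(A_\star) = \frac{Q(A_\star, S \setminus A_\star)}{\pi(A_\star)} = \frac{Q(A_\star, R_i \setminus A_\star)}{\pi(A_\star)} + \frac{Q(A_\star, S \setminus R_i)}{\pi(A_\star)}$$
$$= \frac{\sum_{x \in A_\star} \sum_{y \in R_i \setminus A_\star} \pi_{R_i}(x) P_{R_i}(x, y)}{\pi_{R_i}(A_\star)} + \frac{Q(A_\star, S \setminus R_i)}{\pi(A_\star)} = B_{R_i}(A_\star) + \frac{Q(A_\star, S \setminus R_i)}{\pi(A_\star)}.$$

Hence, $B(A_\star) = e^{-\omega(\log n)}$.

If $\frac{Q(A_\star, S \setminus R_i)}{\pi(A_\star)} = e^{-O(\log n)}$. Consider $\bar{A}_\star = R_i \setminus A_\star$. Note that
$\frac{Q(A_\star, S \setminus R_i)}{\pi(R_i)} + \frac{Q(\bar{A}_\star, S \setminus R_i)}{\pi(R_i)} = B(R_i) \le e^{-\omega(\log n)}$;
otherwise $R_i$ would not have been returned by the algorithm. Hence, we obtain

$$Q(A_\star, S \setminus R_i) \le e^{-\omega(\log n)} \pi(R_i) \quad \text{and} \quad Q(\bar{A}_\star, S \setminus R_i) \le e^{-\omega(\log n)} \pi(R_i).$$

From the first of these inequalities and the hypothesis on $\frac{Q(A_\star, S \setminus R_i)}{\pi(A_\star)}$, we have
$\pi(A_\star) \le e^{-\omega(\log n)} \pi(R_i)$.
Hence $\frac{Q(A_\star, \bar{A}_\star)}{\pi(R_i)} \le e^{-\omega(\log n)} \frac{Q(A_\star, \bar{A}_\star)}{\pi(A_\star)} = e^{-\omega(\log n)} B_{R_i}(A_\star) \le e^{-\omega(\log n)}$. Then we obtain

$$B(\bar{A}_\star) = \frac{Q(\bar{A}_\star, S \setminus \bar{A}_\star)}{\pi(\bar{A}_\star)} = \frac{Q(\bar{A}_\star, A_\star)}{\pi(\bar{A}_\star)} + \frac{Q(\bar{A}_\star, S \setminus R_i)}{\pi(\bar{A}_\star)}$$
$$\text{(by reversibility of } P \text{)} \quad = \frac{Q(A_\star, \bar{A}_\star)}{\pi(R_i) - \pi(A_\star)} + \frac{Q(\bar{A}_\star, S \setminus R_i)}{\pi(R_i) - \pi(A_\star)}$$
$$\le \frac{e^{-\omega(\log n)}}{1 - e^{-\omega(\log n)}} + \frac{e^{-\omega(\log n)}}{1 - e^{-\omega(\log n)}} \le e^{-\omega(\log n)}.$$

In both cases we have that for every $n$ there exists a set that is contained in $R_i$, and hence its stationary
probability is less than $\pi(R_i)$, and that has bottleneck ratio the inverse of a super-polynomial: as such,
this set must be chosen before $R_i$ by Algorithm 4.6. But since in the third step of the algorithm at least
one element of such sets is deleted from $N$, we have that $R_i$ cannot be returned by
the algorithm, thus a contradiction.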

Hence, $\min\{ B_{R_i}(A) \mid A \subseteq R_i : \pi_{R_i}(A) \le 1/2 \} = e^{-O(\log n)}$, and then from Theorem C.2 and the
properties of the relaxation time (see Theorems 3.2 and A.1) it follows that $\mathcal{R}_i$ has mixing time

$$t_{mix} \le 2 \cdot e^{O(\log n)} \cdot \log \frac{4}{\min_{x \in R_i} \pi_{R_i}(x)} = e^{O(\log n)},$$

where the last equality follows from the fact that by Lemma 4.11 and the assumption $m = \mathrm{poly}(n)$,

$$\min_{x \in R_i} \pi_{R_i}(x) \ge \frac{e^{-\beta \Phi_{\max}}}{|R_i| \, e^{-\beta \Phi_{\min}}} \ge \frac{1}{e^{n \log m + O(\log n)}},$$

with $\Phi_{\max}$ and $\Phi_{\min}$ denoting the maximum and minimum potential over all possible strategy profiles in
$R_i$, respectively. □



Given $\Phi$, $\beta$, let $tm_i^{\varepsilon}(n)$ be the mixing time $t_{mix}(\varepsilon)$ of the chain $\mathcal{R}_i$ of length $n$. Since by Lemma 4.12
$tm_i^{\varepsilon}(n)$ is a polynomial in $n$ for $\varepsilon > \frac{1}{\mathrm{poly}(n)}$, it is admissible to set $p_i(n) = tm_i^{\varepsilon}(n)$ in
Algorithm 4.6. Hence using Corollary 4.10 we can prove the following theorem.



Theorem 4.13. Let $\Phi$ be an AWD potential function class and consider the logit dynamics for $\Phi$. For
every $\beta$, $\delta$, $\varepsilon > \frac{1}{\mathrm{poly}(n)}$ and $n$ large enough, if $R_i$ and $T_i$ are the sets returned at the $i$-th iteration of
Algorithm 4.6 on input $\beta$, $\delta$ and $n$, with $p_j(n) = tm_j^{\varepsilon}(n)$, for $j \le i$, then the pseudo-mixing time
$t_{\mu_i}^{T_i}(\varepsilon + \delta)$ of $\mu_i$ from $T_i$ is polynomial in $n$, where $\mu_i$ is defined as above.



Proof. Using Corollary 4.10, the definition of $T_i$ and the fact that $tm_i^{\varepsilon}(n)$ is the mixing time $t_{mix}(\varepsilon)$ of
$\mathcal{R}_i$, for $x \in T_i$ we obtain

$$\big\| P^{tm_i^{\varepsilon}(n)}(x, \cdot) - \mu_i \big\|_{TV} \le \varepsilon + \delta \quad \Longrightarrow \quad t_{\mu_i}^{T_i}(\varepsilon + \delta) \le tm_i^{\varepsilon}(n),$$

where the implication follows from the definition of pseudo-mixing time.

□

4.4 Pseudo-mixing time starting from $N$ for "small" $\beta$

Let $N^{\Phi,\beta}$ be the set of states $N$ at the end of the execution of Algorithm 4.6 on input $\Phi$, $\beta$ and $n$. (Once
again, we do not need to fix the choice of $\delta$ and of the polynomials the algorithm uses as $N^{\Phi,\beta}$ enjoys
the properties we need for any such choice.) As for the Markov chains $\mathcal{R}_i$, we stress the dependence
of $N^{\Phi,\beta}$ from $n$ by calling $n$ the length of $N^{\Phi,\beta}$ and simply write $N$ when $\Phi$ and $\beta$ are understood. Let
us denote with $\Delta(\cdot)$ the function that, for every $n$, gives the Lipschitz constant of $\Phi = \Phi^{(n)}$, i.e.,

$$\Delta(n) := \max\{ \Phi(x) - \Phi(y) : H(x, y) = 1 \}.$$

In this section, we prove the following theorem. 

Theorem 4.14. Let <& be an AWD potential function class and consider the logit dynamics for For 
every Q(x) = poly (x) and for every < f3 < ^fey, £ > 1/poly (n) and every x G N®'@ there exists 
a distribution u x metastable for a super-polynomial in n number of steps, n being the length of A* 1 ' 3 . 
Furthermore, the pseudo-mixing time tj^(3s) of u x from the profile x is polynomial in n. 

We next define the metastable distribution $\nu_x$, $x \in N^{\Phi,\beta}$. Consider the distributions $\mu_i$ defined above
(i.e., the stationary distribution restricted to $R_i$ output of the algorithm). We focus here on distributions
of the form

$$\nu(y) = \sum_i \alpha_i \mu_i(y),$$

for $\alpha_i \ge 0$ and $\sum_i \alpha_i = 1$. Notice that the distribution $\nu$ is a convex combination of distributions that
are metastable for super-polynomial time: thus, from Lemma 4.8 each such $\nu$ is metastable for super-polynomial
time in $n$. To prove fast convergence to some $\nu$ starting from $x \in N^{\Phi,\beta}$ we need to fix the
values of the $\alpha_i$'s. Towards this end, fix $\beta$ and $n$, and consider the hitting time $\tau_{S \setminus N}$ of $S \setminus N$ of the
logit dynamics for $\Phi = \Phi^{(n)}$. For $x \in N$ and $0 < \varepsilon < 1$, we also let $\mathcal{T}^{\varepsilon}_{S \setminus N}(x)$ be the first time step $t$ in
which $\mathbf{P}_x(\tau_{S \setminus N} > t) \le \varepsilon$. For every $n$ and every profile $x \in N^{\Phi,\beta}$, we then define the distribution

$$\nu_x(y) = \sum_i \mu_i(y) \cdot \mathbf{P}_x\big( X_{\tau_{S \setminus N}} \in T_i \mid \tau_{S \setminus N} \le \mathcal{T}^{\varepsilon}_{S \setminus N}(x) \big). \qquad (9)$$

Observe that by the definitions of $\tau_{S \setminus N}$, the $T_i$'s and $N$, since the $T_i$'s and $N$ are a partition of $S$,
$X_{\tau_{S \setminus N}} \in \bigcup_i T_i$ is a certain event for all values of $\tau_{S \setminus N}$. Moreover, we show below that we
can condition on the event $\tau_{S \setminus N} \le \mathcal{T}^{\varepsilon}_{S \setminus N}(x)$. Thus,
$\sum_i \mathbf{P}_x\big( X_{\tau_{S \setminus N}} \in T_i \mid \tau_{S \setminus N} \le \mathcal{T}^{\varepsilon}_{S \setminus N}(x) \big) = 1$.
The above is then a valid definition of the $\alpha_i$'s.



To prove Theorem 4.14 we give another useful relation between bottleneck ratio and hitting time, whose proof follows from characterizations of hitting time and bottleneck ratio in terms of the largest eigenvalue of a suitably chosen matrix.



4.4.1 Technical tools 

We give another useful relation between bottleneck ratio and hitting time. Before that, we recall a characterization in terms of bottleneck ratio of $1 - \lambda^L_{\max}$, where $\lambda^L_{\max}$ is the largest eigenvalue in absolute value of the matrix $P_L$ defined in (6): the proof of this characterization is exactly the same as a similar well-known characterization for the spectral gap of stochastic matrices (see, for example, Section 13.3.3 in [19]).

Lemma 4.15. For any $\emptyset \ne L \subset S$, $1 - \lambda^L_{\max} \ge \frac{(B^L_\star)^2}{2}$.



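Lemma 4.16. Let $\Phi$ be a potential function with profile space $S$ and $P$ be the transition matrix of the logit dynamics for $\Phi$. For $\beta > 0$, $\emptyset \ne L \subset S$, $x \in L$ and $0 < \varepsilon < 1$, let $\mathcal{T}^{\varepsilon}_{S \setminus L}(x)$ be defined as above (with respect to $P$). Then

$$\mathcal{T}^{\varepsilon}_{S \setminus L}(x) \le (B^L_\star)^{-2} \left( \frac{2(1-\varepsilon)}{\varepsilon} + \log \frac{1}{\pi_L(x)} \right),$$

where $\pi_L(x) = \frac{\pi(x)}{\pi(L)}$ and $B^L_\star = \min_{A \subset L \,:\, \pi(A) \le 1/2} B(A)$.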

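Proof. It is known that the hitting time of $S \setminus L$ can be expressed as a function of the eigenvalues of the matrix $P_L$ (see Theorem B.3). In particular, we have

$$\begin{aligned}
P_x(\tau_{S \setminus L} > t) &\le \exp\left( t \log \lambda^L_{\max} + \frac{1}{2} \log \frac{1}{\pi_L(x)} \right) \\
(\text{since } 1 - a \le e^{-a}) \quad &\le \exp\left( -t\left(1 - \lambda^L_{\max}\right) + \frac{1}{2} \log \frac{1}{\pi_L(x)} \right) \\
(\text{by Lemma 4.15}) \quad &\le \exp\left( -\frac{1}{2} \left( t (B^L_\star)^2 - \log \frac{1}{\pi_L(x)} \right) \right) \\
(\text{since } e^{-a} \le (1 + a)^{-1}) \quad &\le \left( 1 + \frac{1}{2} \left( t (B^L_\star)^2 - \log \frac{1}{\pi_L(x)} \right) \right)^{-1}.
\end{aligned}$$

Thus, by setting $t = (B^L_\star)^{-2} \left( \frac{2(1-\varepsilon)}{\varepsilon} + \log \frac{1}{\pi_L(x)} \right)$, we have $P_x(\tau_{S \setminus L} > t) \le \varepsilon$ and then $\mathcal{T}^{\varepsilon}_{S \setminus L}(x)$ is upper bounded by this value of $t$.

□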
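
The last step of the proof of Lemma 4.16 is a short piece of algebra: the chosen value of $t$ is exactly the one that makes the final bound on the tail probability equal to $\varepsilon$. The sketch below checks this numerically. The function names and the sample values of the bottleneck ratio, of $\varepsilon$ and of $\pi_L(x)$ are hypothetical and serve only as an illustration.

```python
import math

def hitting_time_bound(bottleneck, eps, pi_L_x):
    """Value of t in Lemma 4.16: B^{-2} * (2(1 - eps)/eps + log(1/pi_L(x)))."""
    return (2.0 * (1.0 - eps) / eps + math.log(1.0 / pi_L_x)) / bottleneck ** 2

def tail_bound(t, bottleneck, pi_L_x):
    """Last line of the chain of inequalities: (1 + (t B^2 - log(1/pi_L(x)))/2)^{-1}."""
    return 1.0 / (1.0 + 0.5 * (t * bottleneck ** 2 - math.log(1.0 / pi_L_x)))

# Hypothetical numbers: bottleneck ratio 1/50, eps = 0.1, pi_L(x) = 1e-6.
B, eps, pi_x = 1 / 50, 0.1, 1e-6
t = hitting_time_bound(B, eps, pi_x)
print(t, tail_bound(t, B, pi_x))
```

Note how the bound scales: an inverse-polynomial bottleneck ratio and a polynomially bounded $\log(1/\pi_L(x))$ keep $t$ polynomial, which is what Section 4.4.2 relies on.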



4.4.2 Pseudo-mixing time 



The main ingredient in the proof of Theorem 4.14 is a bound on the hitting time $\tau_{S \setminus N}$: from Lemma 4.16, for every $\varepsilon > 1/\mathrm{poly}(n)$ we have that

$$\mathcal{T}^{\varepsilon}_{S \setminus N}(x) \le (B^N_\star)^{-2} \left( \frac{2(1-\varepsilon)}{\varepsilon} + \log \frac{1}{\pi_N(x)} \right) = \mathrm{poly}(n),$$

where the last equality holds since, by definition of Algorithm 4.6, every subset of $N$ has bottleneck ratio at least the inverse of a polynomial, and since $\log(1/\pi_N(x)) = O(\beta n A(n) + \mathrm{poly}(n)) = \mathrm{poly}(n)$, which follows from the upper bound on $\beta$, $m$ being polynomial in $n$ and $A(n)$ being the Lipschitz constant of the potential function.

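For every $x \in N$ and $\varepsilon > 1/\mathrm{poly}(n)$, let us fix $t^\star = \mathcal{T}^{\varepsilon}_{S \setminus N}(x) + \max_i t^{T_i}_{\mu_i}(\varepsilon)$: notice that, by Theorem 4.13, $t^\star$ is a polynomial function in $n$. For the sake of readability, let us denote with $E$ the event "$\tau_{S \setminus N} \le \mathcal{T}^{\varepsilon}_{S \setminus N}(x)$" and with $\bar{E}$ its complement. Recall from Definition 2.1 that $X_t$ denotes the state of the Markov chain defined by logit dynamics at step $t$ and observe that

$$\begin{aligned}
\left\| P^{t^\star}(x, \cdot) - \nu_x \right\|_{TV} &= \max_{A \subseteq S} \left| P_x(X_{t^\star} \in A) - \nu_x(A) \right| \\
&= \max_{A \subseteq S} \left| P_x(X_{t^\star} \in A \wedge E) - \nu_x(A) + P_x(X_{t^\star} \in A \wedge \bar{E}) \right| \\
&= \max_{A \subseteq S} \left| P_x(X_{t^\star} \in A \mid E) \left(1 - P_x(\bar{E})\right) - \nu_x(A) + P_x(X_{t^\star} \in A \mid \bar{E}) \, P_x(\bar{E}) \right| \\
&\le \max_{A \subseteq S} \left| P_x(X_{t^\star} \in A \mid E) - \nu_x(A) \right| + P_x(\bar{E}) \\
&\le \left\| P_x(X_{t^\star} \mid E) - \nu_x \right\|_{TV} + \varepsilon,
\end{aligned}$$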
where the definition of $\mathcal{T}^{\varepsilon}_{S \setminus N}(x)$ implies that $P_x(E) \ge 1 - \varepsilon > 0$ and then yields the third equality and last inequality. The penultimate inequality, instead, simply follows from the subadditivity of the absolute value and the fact that the difference between two probabilities is upper bounded by 1. As every $\mu_i$ is metastable for a super-polynomial number of steps, we have, by using $\tau^\star$ as a shorthand for $\tau_{S \setminus N}$,



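$$\begin{aligned}
\left\| P_x(X_{t^\star} \mid E) - \nu_x \right\|_{TV} &= \left\| \sum_i \sum_{y \in T_i} P_x(X_{\tau^\star} = y \mid E) \cdot P_x(X_{t^\star} \mid X_{\tau^\star} = y \wedge E) - \nu_x \right\|_{TV} \\
&\le \left\| \sum_i \sum_{y \in T_i} P_x(X_{\tau^\star} = y \mid E) \left( P^{t^\star - \tau^\star}(y, \cdot) - \mu_i \right) \right\|_{TV} \\
&\le \sum_i \sum_{y \in T_i} P_x(X_{\tau^\star} = y \mid E) \left\| P^{t^\star - \tau^\star}(y, \cdot) - \mu_i \right\|_{TV} \\
&\le 2\varepsilon,
\end{aligned}$$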



where the definition of $\tau^\star$ yields $X_{\tau^\star} \in T_i$, for some $i$, which in turn yields the first equality by the law of total probability. In the first inequality above, instead, we use the definition of $\nu_x$ and the fact that, by definition of $t^\star$, $E$ implies $t^\star - \tau^\star \ge t^\star - \mathcal{T}^{\varepsilon}_{S \setminus N}(x) = \max_i t^{T_i}_{\mu_i}(\varepsilon)$; the second inequality follows from a simple union bound; and the last inequality follows from Lemma 2.7 (note that $t^\star - \tau^\star$ satisfies the hypothesis of the lemma: the lower bound is shown above, while the upper bound follows from the fact that the $\mu_i$'s are metastable for super-polynomial time). Hence, we have for every $x \in N$, $t^{x}_{\nu_x}(3\varepsilon) \le t^\star = \mathrm{poly}(n)$.



4.5 The main result 



We obtain our main result as a corollary of Theorems 4.3 4.13 and 4.14 and of the observations done in 
Section O 



Theorem 4.17. Let $\mathcal{G}$ be an $n$-player potential game. Let $A(j)$ be the function returning the maximum Lipschitz constant between the potential functions of the game when the number of players is $j$. Then, for every $0 < \beta \le \mathrm{poly}(n)/A(n)$, $\varepsilon > 1/\mathrm{poly}(n)$ and every starting profile $x$, there exists a distribution that is metastable for a number of steps super-polynomial in $n$ and whose pseudo-mixing time is polynomial in $n$. Additionally, if the starting profile is in some $T_i$ defined by Algorithm 4.6 on input some AWD potential function class defined upon $\mathcal{G}$, then the result above holds for any $\beta$.



5 An application: the Curie-Weiss model



In the previous section we adopted Algorithm 4.6 to find the metastable distributions of a generic potential game. However, the algorithm appears impractical since it does not allow one to explicitly define the metastable distributions. Hence, since we know that such distributions exist, it is natural to ask how we can find a more explicit description of metastable distributions for specific games. We will show in this section that our ideas can be used to this aim: specifically, we apply these ideas to solve a problem left open in [6].

Consider the following game-theoretic formulation of the well-studied Curie-Weiss model (the Ising model on the complete graph), that we will call CW-game: each one of $n$ players has two strategies, $-1$ and $+1$, and the utility of player $i$ at profile $x = (x_1, \ldots, x_n) \in \{-1, +1\}^n$ is $u_i(x) = x_i \sum_{j \ne i} x_j$. Observe that for every player $i$ it holds that

$$u_i(x_{-i}, +1) - u_i(x_{-i}, -1) = \mathcal{H}(x_{-i}, -1) - \mathcal{H}(x_{-i}, +1),$$

where $\mathcal{H}(x) = -\sum_{j \ne k} x_j x_k$ (with each pair $\{j, k\}$ counted once), hence the CW-game is a potential game with potential function $\mathcal{H}$. The magnetization of $x$ is defined as $S(x) = \sum_i x_i$.
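
The potential identity for the CW-game can be checked exhaustively on small instances. The sketch below assumes the utility $u_i(x) = x_i \sum_{j \ne i} x_j$ and a potential that counts each unordered pair of players once; the function names are hypothetical and chosen only for this illustration.

```python
from itertools import product

def utility(i, x):
    """u_i(x) = x_i times the sum of the other players' strategies."""
    return x[i] * (sum(x) - x[i])

def potential(x):
    """H(x) = -sum of x_j * x_k, each unordered pair {j, k} counted once (assumption)."""
    n = len(x)
    return -sum(x[j] * x[k] for j in range(n) for k in range(j + 1, n))

def is_potential_game(n):
    """Check u_i(x_-i, +1) - u_i(x_-i, -1) == H(x_-i, -1) - H(x_-i, +1) for all x and i."""
    for x in product((-1, +1), repeat=n):
        for i in range(n):
            plus = x[:i] + (+1,) + x[i + 1:]
            minus = x[:i] + (-1,) + x[i + 1:]
            lhs = utility(i, plus) - utility(i, minus)
            rhs = potential(minus) - potential(plus)
            if lhs != rhs:
                return False
    return True

print(is_potential_game(5))
```

Both sides equal twice the sum of the other players' strategies, so the check succeeds for every $n$; counting ordered pairs instead would double the right-hand side.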

It is well-known (see, for example, Chapter 15 in [19]) that the logit dynamics for this game (or equivalently the Glauber dynamics for the Curie-Weiss model) has mixing time polynomial in $n$ for $\beta < 1/n$ and exponential as long as $\beta > 1/n$. Moreover, [6] describes metastable distributions for $\beta > c \log n / n$ and shows that such distributions are quickly reached from profiles where the number of $+1$ (respectively $-1$) is a sufficiently large majority, namely if the magnetization $k$ is such that $k^2 > c \log n / \beta$. Thus it is left open what happens when $\beta$ lies in the interval $(1/n, c \log n / n)$ and whether a metastable distribution is quickly reached when in the starting point the number of $+1$ is close to the number of $-1$. The following theorem closes this open problem, by following an approach similar to the one taken in the previous section.

Theorem 5.1. Let $\mathcal{G}$ be the $n$-player CW-game and consider the logit dynamics for $\mathcal{G}$. Let $S_+$ (resp., $S_-$) be the set of profiles with positive (resp., negative) magnetization and $\pi_+$ (resp., $\pi_-$) be the restriction of the stationary distribution to $S_+$ (resp., $S_-$). If $\beta > 1/n$, then $\pi_+$ and $\pi_-$ are $(\varepsilon, T)$-metastable, with $\varepsilon > 1/\mathrm{poly}(n)$ and $T$ exponential in $n$. Moreover, for every starting profile the logit dynamics reaches a convex combination of these distributions in polynomial time.
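
To get a feeling for why $S_+$ behaves as a metastable set when $\beta > 1/n$, one can compute its bottleneck ratio exactly on the magnetization chain. The sketch below is only an illustration under stated assumptions: the stationary distribution is taken to be proportional to $e^{-\beta \mathcal{H}(x)}$ with each pair of players counted once in $\mathcal{H}$, the bottleneck ratio of a set is taken to be the stationary flow leaving the set divided by its stationary mass, $n$ is odd so that no profile has magnetization $0$, and the function name is hypothetical.

```python
import math

def bottleneck_ratio_positive(n, beta):
    """B(S+) = Q(S+, S \\ S+) / pi(S+) on the magnetization chain, for odd n."""
    assert n % 2 == 1, "odd n guarantees that no profile has magnetization 0"

    # Unnormalized stationary weight of 'm players play +1': the number of such
    # profiles times exp(-beta * H), where H = -(s^2 - n) / 2 and s = 2m - n.
    def weight(m):
        s = 2 * m - n
        return math.comb(n, m) * math.exp(beta * (s * s - n) / 2.0)

    m0 = (n + 1) // 2  # magnetization +1: the only states from which S+ can be left
    mass_plus = sum(weight(m) for m in range(m0, n + 1))
    # To leave S+ a player at +1 is selected (probability m0 / n) and switches to -1.
    # The other players sum to 0, so the logit rule picks either strategy w.p. 1/2.
    leave = (m0 / n) * 0.5
    return weight(m0) * leave / mass_plus

for n in (11, 21, 41):
    print(n, bottleneck_ratio_positive(n, 2.0 / n))
```

With $\beta = 2/n$ the printed values shrink quickly as $n$ grows, in line with the exponentially small bottleneck ratio used in the proof sketch.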

Sketch of proof. The proof consists of three different parts corresponding to the components of the proof of our main result: first, we identify $\pi_+$ and $\pi_-$ as metastable distributions by looking at the bottleneck ratio of their support, as done in Lemma 4.7; then, we characterize the "core" of these metastable distributions and we bound the mixing time of the chains restricted to $S_+$ and $S_-$ when the starting point is in the "core", establishing in this way, as in Theorem 4.13, that the time that the logit dynamics takes to reach a metastable distribution from the "core" is polynomial; finally, we show that the bottleneck ratio of profiles "out-of-core" is at least inverse-polynomial and hence, as in Theorem 4.14, it follows that the dynamics quickly converges to a convex combination of metastable distributions.

More specifically, the metastability of $\pi_+$ and $\pi_-$ quickly follows from the fact that, for $\beta > 1/n$, the bottleneck ratio of $S_+$ and $S_-$ is exponentially small in $n$ (see, for example, Theorem 15.3 in [19]) and from Lemma 2.4.

The "core" of these distributions, i.e., profiles from which the dynamics leaves $S_+$ or $S_-$ in polynomial time with probability at most $\varepsilon$, is given by states with magnetization $k$ such that $k^2 > c/\beta$, where $c = c(\varepsilon)$ is constant. This can be proved by considering the magnetization chain, i.e., the birth and death chain on the space $\{-n, 2-n, \ldots, n-2, n\}$, and asking for the hitting time of $l < 0$ when the starting point is $k$: it is well known that the hitting time in a birth and death chain depends only on the ratio between the probability to go back and to go ahead (see, for example, Section 2.5 in [19]). The characterization then follows by showing that this ratio is at least constant for any starting profile (see, for example, Lemma 4.5 in [6]).

As for the mixing time of the chains restricted to $S_+$ and $S_-$ starting from the "core", we distinguish two cases: for $\beta = O(\log n / n)$ we achieve a polynomial bound following the approach adopted by [18] for bounding the mixing time of censored chains;$^7$ for larger $\beta$, a polynomial bound follows since the extremal profiles are hit in polynomial time, as shown by Lemma 4.7 in [6].

From the above characterization of profiles in the "core", it turns out that the number of remaining profiles is at most polynomial; profiles on the boundary are the ones that maximize the stationary probability among "out-of-core" profiles; moreover, the probability of leaving the subset is greater than $1/\mathrm{poly}(n)$ since there are always neighbors with a lower potential value. This proves a polynomial bottleneck for "out-of-core" profiles and completes the proof of the theorem. □



$^7$ The censored chain of [18] is exactly our restricted chain, except that the probability that the original chain from a profile $x$ goes out from $L$ is "reflected" to some profile in $L$ different from $x$, instead of being "added" to the probability of not leaving $x$.

6 Conclusions and open problems



In this work we prove that for any potential game and for any profile $x$ there is a distribution which is stable for a super-polynomial number of steps and is quickly reached from $x$ if $\beta$ is small enough. For a rich set of profiles $x$, the pseudo-mixing time is polynomial for any $\beta$. As we mentioned above, an assumption on $\beta$ is necessary because when $\beta$ is high enough logit dynamics roughly behaves as best-response dynamics. Moreover, in this case, the only metastable distributions have to be concentrated around the set of Nash equilibria. This is because for $\beta$ very high, it is extremely unlikely that a player leaves a Nash equilibrium. Then, the hardness results about the convergence of best-response dynamics for potential games, cf., e.g., [13], imply that the convergence to metastable distributions for high $\beta$ is similarly computationally hard. Interestingly, this difference in the behavior of the logit dynamics for different values of $\beta$ suggests that "the more noisy the system is, the more (meta)stable it is."

Although quick convergence to metastable distributions for a high level of rationality is impossible to prove for each game, it would be interesting to analyze under which conditions this behavior occurs. Finally, note that our result is in a sense existential, since we are not able to explicitly describe the distributions. It is an interesting open problem to characterize the sets $R_i$ and $T_i$ returned by Algorithm 4.6 for some specific class of games in order to better understand the stability guarantee of the distributions. Alternatively, our ideas may turn out to be useful to find different metastable distributions which can be explicitly defined and thus allow one to make predictions about the future. (We give a first example of this approach in Section 5.) A better understanding of the spectra of the transition matrix along the lines of the results we prove may help in answering some of the questions above.

Naturally, there are other questions of general interest about metastability that we do not consider. 
For example, akin to price of anarchy and price of stability, one may ask what is the performance of a 
system in a metastable distribution? One might also want to investigate metastable behavior of different 
dynamics, such as best-response dynamics. However, in the latter case, no matter what selection rule 
is used to choose which player has to move next, a profile is never visited twice in time since at each 
step the potential goes down. Therefore, the "metastable" behavior of best-response dynamics would 
roughly correspond to an (exponentially long) sequence of profiles visited. This, however, would not 
add much to our understanding of the transient phase of best-response dynamics. 

A The usefulness of relaxation time



The relaxation time is related to the mixing time by the following theorem (see, for example, Theorems 12.3 and 12.4 in [19]).

Theorem A.1 (Relaxation time). Let $P$ be the transition matrix of a reversible, irreducible, and aperiodic Markov chain with state space $S$ and stationary distribution $\pi$. Then

$$(t_{\mathrm{rel}} - 1) \log 2 \le t_{\mathrm{mix}} \le \log\left(\frac{4}{\pi_{\min}}\right) t_{\mathrm{rel}},$$

where $\pi_{\min} = \min_{x \in S} \pi(x)$.
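
The two bounds of Theorem A.1 can be observed on the smallest non-trivial example, a two-state chain, where the only eigenvalue different from 1 is $1 - p - q$. The following sketch is an illustration under stated assumptions: the relaxation time is taken to be the inverse of the absolute spectral gap, the mixing time is the first step at which the worst-case total variation distance is at most $1/4$, and the function names and the values of $p$ and $q$ are hypothetical.

```python
import math

def two_state_chain(p, q):
    """Transition matrix on {0, 1}: 0 -> 1 with probability p, 1 -> 0 with probability q."""
    return [[1 - p, p], [q, 1 - q]]

def step(dist, P):
    """One step of the chain applied to a row distribution."""
    size = len(P)
    return [sum(dist[x] * P[x][y] for x in range(size)) for y in range(size)]

def mixing_time(P, pi, eps=0.25):
    """Smallest t such that max_x ||P^t(x, .) - pi||_TV <= eps."""
    size = len(P)
    rows = [[1.0 if y == x else 0.0 for y in range(size)] for x in range(size)]
    t = 0
    while True:
        worst = max(0.5 * sum(abs(a - b) for a, b in zip(row, pi)) for row in rows)
        if worst <= eps:
            return t
        rows = [step(row, P) for row in rows]
        t += 1

p, q = 0.05, 0.05
P = two_state_chain(p, q)
pi = [q / (p + q), p / (p + q)]          # stationary distribution of the two-state chain
t_rel = 1.0 / (1.0 - abs(1 - p - q))     # the only non-trivial eigenvalue is 1 - p - q
t_mix = mixing_time(P, pi)
lower = (t_rel - 1) * math.log(2)
upper = math.log(4 / min(pi)) * t_rel
print(lower, t_mix, upper)
```

For these values the relaxation time is 10 and the mixing time is 7, which sits between the lower bound (about 6.2) and the upper bound (about 20.8).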



B Hitting time bounds 

Consider a reversible Markov chain with state space $S$ and transition matrix $P$. For $L \subset S$ let $P_L$ and $\lambda^L_{\max}$ be as defined in Section 4.2.1. Here we give a well known (see, e.g., [22]) variational characterization of $\lambda^L_{\max}$ as expressed by the following lemma.



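Lemma B.1. Consider a reversible Markov chain with state space $S$, transition matrix $P$ and stationary distribution $\pi$. For any $L \subset S$ we have

$$1 - \lambda^L_{\max} = \inf_{\varphi} \frac{\mathcal{E}_P(\varphi)}{E_\pi[\varphi^2]},$$

where $\mathcal{E}_P(\varphi)$ is the Dirichlet form of $P$ as defined in the main text, $E_\pi[\varphi^2] = \sum_x \pi(x) \varphi^2(x)$ and the inf is taken over functions $\varphi$ such that $\varphi(x) = 0$ for $x \in S \setminus L$ and $E_\pi[\varphi^2] \ne 0$.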

The following theorems relate $\tau_{S \setminus L}$ and $\lambda^L_{\max}$ and are already stated in, e.g., [22].

Theorem B.2. For a reversible Markov chain with state space $S$, any $L \subset S$ and any $t$ it holds that

$$\max_x P_x(\tau_{S \setminus L} > t) \ge \exp\left( t \log \lambda^L_{\max} \right).$$

Theorem B.3. For a reversible Markov chain with state space $S$, any $L \subset S$ and any $t$ it holds that

$$P_x(\tau_{S \setminus L} > t) \le \exp\left( t \log \lambda^L_{\max} + \frac{1}{2} \log \frac{1}{\pi_L(x)} \right),$$

where $\pi_L(x)$ has been defined in (3).

Since the statement of Theorem B.3 is slightly different from the ones found in previous literature, we attach a proof for the sake of completeness.

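Proof. Let $\varphi_L$ be the characteristic function on $L$, that is $\varphi_L(x) = 1$ if $x \in L$ and $0$ otherwise. Then

$$P_x(\tau_{S \setminus L} > t) = \sum_{y \in S} P_L^t(x, y) = \sum_{y \in S} P_L^t(x, y) \varphi_L(y) = (P_L^t \varphi_L)(x). \qquad (10)$$

Since $P_L$ is reversible with respect to $\pi_L$, we have that its eigenvectors, $\psi_1, \ldots, \psi_{|S|}$, form an orthonormal basis with respect to the inner product $\langle \cdot, \cdot \rangle_{\pi_L}$: in particular we can write $\varphi_L = \sum_i \alpha_i \psi_i$, where $\sum_i \alpha_i^2 = 1$ and each $\alpha_i \ge 0$. Hence and from the linearity of the inner product we have

$$\begin{aligned}
\langle P_L^t \varphi_L, P_L^t \varphi_L \rangle_{\pi_L} &= \sum_i \sum_j \left\langle \alpha_i (\lambda_i)^t \psi_i, \alpha_j (\lambda_j)^t \psi_j \right\rangle_{\pi_L} \\
(\text{by orthogonality}) \quad &= \sum_i (\lambda_i)^{2t} \langle \alpha_i \psi_i, \alpha_i \psi_i \rangle_{\pi_L} \qquad (11) \\
&\le \left(\lambda^L_{\max}\right)^{2t} \langle \varphi_L, \varphi_L \rangle_{\pi_L} = \left(\lambda^L_{\max}\right)^{2t},
\end{aligned}$$

where the last equality follows from the definition of $\varphi_L$. Moreover,

$$\pi_L(x) \left[ (P_L^t \varphi_L)(x) \right]^2 \le \sum_{y \in S} \pi_L(y) \left[ (P_L^t \varphi_L)(y) \right]^2 = \langle P_L^t \varphi_L, P_L^t \varphi_L \rangle_{\pi_L}. \qquad (12)$$

The theorem follows from (10), (11) and (12). □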



C Bottleneck ratio bounds 

We use the following theorem to derive lower bounds to the mixing time (see, for example, Theorem 7.3 in [19]).

Theorem C.1 (Bottleneck ratio). Let $\mathcal{M} = \{X_t : t \in \mathbb{N}\}$ be an irreducible and aperiodic Markov chain with finite state space $S$, transition matrix $P$, and stationary distribution $\pi$. Let $L \subseteq S$ be any set with $\pi(L) \le 1/2$. Then the mixing time is

$$t_{\mathrm{mix}} \ge \frac{1}{4 B(L)}.$$

The bottleneck ratio is also strictly related to the relaxation time. Indeed, let

$$B_\star = \min_{L \,:\, \pi(L) \le 1/2} B(L),$$

then the following theorem holds (see, for example, Theorem 13.14 in [19]).

Theorem C.2. Let $P$ be the transition matrix of a reversible, irreducible, and aperiodic Markov chain with state space $S$. Let $\lambda_2$ be the second largest eigenvalue of $P$. Then

$$\frac{B_\star^2}{2} \le 1 - \lambda_2 \le 2 B_\star.$$
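
Both theorems of this appendix can be checked by brute force on a small chain. The sketch below uses a lazy random walk on a path with four states, a hypothetical toy chain chosen because its second eigenvalue is known in closed form. It assumes that the bottleneck ratio of a set $L$ is the stationary flow from $L$ to its complement divided by $\pi(L)$; all function names are illustrative.

```python
import math
from itertools import combinations

def bottleneck_ratio(P, pi, L):
    """B(L) = sum over x in L, y not in L of pi(x) P(x, y), divided by pi(L) (assumed definition)."""
    flow = sum(pi[x] * P[x][y] for x in L for y in range(len(P)) if y not in L)
    return flow / sum(pi[x] for x in L)

def min_bottleneck(P, pi):
    """Minimum of B(L) over non-empty sets L with pi(L) <= 1/2, by enumeration."""
    best = float("inf")
    for size in range(1, len(P)):
        for L in combinations(range(len(P)), size):
            if sum(pi[x] for x in L) <= 0.5:
                best = min(best, bottleneck_ratio(P, pi, set(L)))
    return best

def mixing_time(P, pi, eps=0.25):
    """Smallest t such that max_x ||P^t(x, .) - pi||_TV <= eps."""
    size = len(P)
    rows = [[1.0 if y == x else 0.0 for y in range(size)] for x in range(size)]
    t = 0
    while max(0.5 * sum(abs(a - b) for a, b in zip(r, pi)) for r in rows) > eps:
        rows = [[sum(r[x] * P[x][y] for x in range(size)) for y in range(size)] for r in rows]
        t += 1
    return t

# Lazy random walk on the path 0 - 1 - 2 - 3; it is symmetric, so pi is uniform.
P = [
    [0.75, 0.25, 0.00, 0.00],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.00, 0.00, 0.25, 0.75],
]
pi = [0.25] * 4
B_star = min_bottleneck(P, pi)
t_mix = mixing_time(P, pi)
# P = I - (path Laplacian) / 4, so its second largest eigenvalue is (1 + cos(pi / 4)) / 2.
lam2 = 0.5 + 0.5 * math.cos(math.pi / 4)
print(B_star, t_mix, 1 - lam2)
```

The minimum is attained by cutting the path in the middle, which gives $B_\star = 1/8$; the enumeration over subsets is exponential and is only sensible for very small state spaces.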

D Markov chain coupling 

A coupling of two probability distributions $\mu$ and $\nu$ on a state space $S$ is a pair of random variables $(X, Y)$ defined on $S \times S$ such that the marginal distribution of $X$ is $\mu$ and the marginal distribution of $Y$ is $\nu$. A coupling of a Markov chain $\mathcal{M}$ on $S$ with transition matrix $P$ is a process $(X_t, Y_t)_{t=0}^{\infty}$ with the property that $X_t$ and $Y_t$ are both Markov chains with transition matrix $P$. Similarly, a coupling of Markov chains $\mathcal{M}$, $\hat{\mathcal{M}}$ both defined on $S$ with transition matrices $P$ and $\hat{P}$, respectively, is a process $(X_t, Y_t)_{t=0}^{\infty}$ with the property that $X_t$ is a Markov chain with transition matrix $P$ and $Y_t$ is a Markov chain with transition matrix $\hat{P}$.

When the two coupled chains start at $(X_0, Y_0) = (x, y)$, we write $P_{x,y}(\cdot)$ for the probability of an event on the space $S \times S$. The following theorem, which follows from Proposition 4.7 and Theorem 5.2 in [19], establishes the importance of this tool.

Theorem D.1 (Coupling). Let $\mathcal{M}$, $\hat{\mathcal{M}}$ be two Markov chains with finite state space $S$ and transition matrices $P$ and $\hat{P}$, respectively. For each pair of states $x, y \in S$ consider a coupling $(X_t, Y_t)$ of $\mathcal{M}$ and $\hat{\mathcal{M}}$ with starting states $X_0 = x$ and $Y_0 = y$. Then

$$\left\| P^t(x, \cdot) - \hat{P}^t(y, \cdot) \right\|_{TV} \le P_{x,y}(X_t \ne Y_t).$$
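
A small worked instance may help to see the coupling bound at work. The sketch below considers the special case in which both chains are the same two-state chain and are driven by a shared uniform random number at every step; this particular coupling, the function names and the parameter values are assumptions made for illustration only. In this case the probability that the two copies have not met can be written in closed form and coincides with the total variation distance, so the bound holds with equality.

```python
def tv_distance_two_state(p, q, t):
    """||P^t(0, .) - P^t(1, .)||_TV for the chain 0 -> 1 w.p. p, 1 -> 0 w.p. q."""
    row0, row1 = [1.0, 0.0], [0.0, 1.0]
    for _ in range(t):
        row0 = [row0[0] * (1 - p) + row0[1] * q, row0[0] * p + row0[1] * (1 - q)]
        row1 = [row1[0] * (1 - p) + row1[1] * q, row1[0] * p + row1[1] * (1 - q)]
    return 0.5 * (abs(row0[0] - row1[0]) + abs(row0[1] - row1[1]))

def prob_not_coalesced(p, q, t):
    """P_{0,1}(X_t != Y_t) when both copies use the same uniform U at every step.

    Update rule: from 0 move to 1 if U < p; from 1 move to 0 if U >= 1 - q.
    With p + q <= 1, copies sitting at (0, 1) stay apart only when p <= U < 1 - q,
    and once they meet they move together forever.
    """
    assert p + q <= 1
    return (1 - p - q) ** t

p, q = 0.2, 0.3
for t in range(6):
    print(t, tv_distance_two_state(p, q, t), prob_not_coalesced(p, q, t))
```

Other couplings of the same chain (for example, letting the two copies move independently until they meet) give larger values on the right-hand side, which is why the choice of coupling matters when the theorem is used to bound mixing times.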